DNA SEQUENCES, VECTORS, AND FUSION 

POLYPEPTIDES FOR SECRETION OF 
POLYPEPTIDES IN FILAMENTOUS FUNGI 

FIELD OF THE INVENTION 

The present invention is directed to increased secretion of desired polypeptides from filamentous 
fungi. The invention discloses fusion nucleic acids, vectors, fusion polypeptides, and processes 
for obtaining the desired polypeptide. 

BACKGROUND OF THE INVENTION 

Production of fusion polypeptides has been reported in a number of organisms, including E. coli, 
yeast, and filamentous fungi. For example, bovine chymosin and porcine pancreatic 
prophospholipase A2 have both been produced in A. niger or A. niger var. awamori as fusions to 
full-length GAI (USSN 08/318,494; Ward et al., Bio/technology 8:435-440, 1990; Roberts et 
al., Gene 122:155-161, 1992). Human interleukin 6 (hIL6) has been produced in A. nidulans as a 
fusion to full-length A. niger GAI (Contreras et al., Biotechnology 9:378-381, 1991). Hen egg 
white lysozyme (Jeenes et al., FEMS Microbiol. Lett. 107:267-272, 1993) and human lactoferrin 
(Ward et al., Bio/technology 13:498-503, 1995) have been produced in A. niger as fusions to 
residues 1-498 of glucoamylase, and hIL6 has been produced in A. niger as a fusion to 
glucoamylase residues 1-514 (Broekhuijsen et al., J. Biotechnol. 31:135-145, 1993). In some of 
the above experiments (Contreras et al., 1991; Broekhuijsen et al., 1993; Ward et al., 1995) a 
KEX2 protease recognition site (Lys, Arg) has been inserted between glucoamylase and the 
desired polypeptide to allow in vivo release of the desired polypeptide from the fusion protein as 
a result of the action of a native Aspergillus KEX2-like protease. 

Additionally, bovine chymosin has been produced in A. niger as a fusion with full-length native 
alpha-amylase (Korman et al., Curr. Genet. 17:203-212, 1990) and in A. oryzae as a fusion with 
truncated forms of A. oryzae glucoamylase (either residues 1-603 or 1-511; Tsuchiya et al., 
Biosci. Biotech. Biochem. 58:895-899, 1994). 

A small protein (epidermal growth factor, EGF; 53 amino acids) has been produced in Aspergillus 
as a tandem fusion of three copies of the protein (US Patent 5,218,093). The trimer of EGF was 
secreted as a result of the inclusion of an N-terminal secretion signal sequence. However, the 
EGF molecules were not additionally fused to a protein efficiently secreted by filamentous fungi 
and no method for subsequent separation of monomeric EGF proteins was provided. 

The glaA gene encodes glucoamylase which is highly expressed in many strains of Aspergillus 
niger and Aspergillus awamori. The promoter and secretion signal sequence of the gene have 
been used to express heterologous genes in Aspergilli including bovine chymosin in Aspergillus 
nidulans and A. awamori as previously described (Cullen, D. et al. (1987) Bio/Technology 5, 
713-719 and EPO Publication No. 0 215 594). In the latter experiments, a variety of constructs 
were made, incorporating prochymosin cDNA, either the glucoamylase or the chymosin 
secretion signal and, in one case, the first 11 codons of mature glucoamylase. Maximum yields 
of secreted chymosin obtained from A. awamori were below 15 mg/l in 50 ml shake flask 
cultures and were obtained using the chymosin signal sequence encoded by pGRG3. These 
previous studies indicated that integrated plasmid copy number did not correlate with chymosin 
yields, abundant polyadenylated chymosin mRNA was produced, and intracellular levels of 
chymosin were high in some transformants regardless of the source of secretion signal. It was 
inferred that transcription was not a limiting factor in chymosin production but that secretion 
may have been inefficient. It was also evident that the addition of a small amino terminal 
segment (11 amino acids) of glucoamylase to the propeptide of prochymosin did not prevent 
activation to mature chymosin. The amount of extracellular chymosin obtained with the first 
eleven codons of glucoamylase, however, was substantially less than that obtained when the 
glucoamylase signal was used alone. Subsequently, it was demonstrated that chymosin 
production could be greatly increased when a fusion protein consisting of full-length 
glucoamylase and prochymosin was produced (USSN 08/318,494; Ward et al. Bio/technology 
8:435-440, 1990). 

Aspergillus niger and Aspergillus niger var. awamori (A. awamori) glucoamylases have identical 
amino acid sequences. The glucoamylase is initially synthesized as preproglucoamylase. The pre 
and pro regions are removed during the secretion process so that mature glucoamylase is 
released to the external medium. Two forms of mature glucoamylase are recognized in culture 
supernatants: GAI is the full-length form (amino acid residues 1-616) and GAII is a natural 
proteolytic fragment comprising amino acid residues 1-512. GAI is known to fold as two 
separate domains joined by an extended linker region. The two domains are the 471 residue 
catalytic domain (amino acids 1-471) and the 108 residue starch binding domain (amino acids 
509-616), the linker region being 36 residues in length (amino acids 472-508). GAII lacks the 
starch binding domain. These details of glucoamylase structure are reviewed by Libby et al. 
(Protein Engineering 7:1109-1114, 1994) and are shown diagrammatically in Fig. 1. 
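
The domain boundaries given above can be expressed as a small lookup, which makes it easy to see which domains a C-terminally truncated glucoamylase retains. The following Python sketch is illustrative only: the names `DOMAINS`, `domain_of` and `domains_retained` are hypothetical, and the residue ranges are simply those stated in the text. Counted inclusively, the 472-508 range spans 37 positions rather than the 36 stated; the sketch uses the stated boundaries as given.

```python
# Residue ranges of mature A. niger glucoamylase (GAI), as stated in the text.
# Ranges are inclusive and 1-based.
DOMAINS = {
    "catalytic domain": (1, 471),
    "linker region": (472, 508),
    "starch binding domain": (509, 616),
}


def domain_of(residue):
    """Return the name of the domain containing a 1-based residue number."""
    for name, (start, end) in DOMAINS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside mature GAI (1-616)")


def domains_retained(last_residue):
    """Describe how much of each domain survives truncation after last_residue."""
    summary = {}
    for name, (start, end) in DOMAINS.items():
        kept = max(0, min(end, last_residue) - start + 1)
        total = end - start + 1
        if kept == total:
            summary[name] = "complete"
        elif kept > 0:
            summary[name] = f"partial ({kept} of {total} residues)"
        else:
            summary[name] = "absent"
    return summary


if __name__ == "__main__":
    # GAII (residues 1-512) and the 1-498 fusion partner mentioned in the text.
    for last in (512, 498):
        print(last, domains_retained(last))
```

A truncation at residue 498, for instance, keeps the whole catalytic domain and part of the linker but none of the starch binding domain.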

Trichoderma reesei produces several cellulase enzymes, including cellobiohydrolase I (CBHI), 
which are folded into two separate domains (catalytic and binding domains) separated by an 
extended linker region. Foreign polypeptides have been secreted in T. reesei as fusions with the 
catalytic domain plus linker region of CBHI (Nyyssonen et al., Bio/technology 11:591-595, 
1993). 

SUMMARY OF THE INVENTION 

An object of the invention herein is to provide for the expression and secretion of desired 
polypeptides by and from filamentous fungi including fusion nucleic acids, expression vectors 
containing such fusion nucleic acids, transformed filamentous fungi, fusion polypeptides and 
processes for expressing and secreting high levels of such desired polypeptides. 

In accordance with the above objects, the invention provides fusion nucleic acids encoding a 
fusion polypeptide comprising, from a 5' end of the fusion nucleic acid, first, second, third and 
fourth nucleic acids. The first nucleic acid encodes a signal polypeptide functional as a secretory 
sequence in a first filamentous fungus. The second nucleic acid encodes a secreted polypeptide 
or functional portion normally secreted from a filamentous fungus. The third nucleic acid 
encodes a cleavable linker and the fourth nucleic acid comprises two or more nucleic acids each 
encoding desired polypeptides. 

Further provided are fusion nucleic acids wherein the fourth nucleic acid further comprises at 
least one nucleic acid encoding a cleavable linker. The nucleic acids encoding said desired 
polypeptides are separated by the nucleic acid encoding the cleavable linker. 

In another aspect, the invention provides a fusion nucleic acid encoding a fusion polypeptide 
comprising, from a 5' end of the fusion nucleic acid, first, fifth, third and second nucleic acids. 
The fifth nucleic acid comprises at least one nucleic acid encoding a desired polypeptide. 

In a further aspect the invention provides a fusion nucleic acid comprising a first nucleic acid, a 
second nucleic acid, and an insertion nucleic acid. The insertion nucleic acid comprises a fifth 
nucleic acid flanked by third nucleic acids. 

Also provided are expression vectors for transforming a host filamentous fungus comprising 
nucleic acids encoding regulatory sequences functionally recognized by said host filamentous 
fungus including promoter and transcription and translation initiation sequences operably linked 
to the 5' end of the fusion nucleic acids described herein. 

Although cleavage of the fusion polypeptide to release the desired polypeptide will often be 
useful, it is not necessary. In those cases in which the desired polypeptide retains its functionality 
when it is part of the fusion polypeptide, cleavage may not be required or desirable. For this 
reason, said third amino acid sequence comprising the cleavable linker may be considered 
optional. Also provided are filamentous fungi containing a fusion nucleic acid described herein. 

Additionally provided are fusion polypeptides comprising, from the amino- to carboxy-terminus, 
first, second, third and fourth amino acid sequences. The first amino acid sequence comprises a 
signal polypeptide functional as a secretory sequence in a filamentous fungus. The second amino 
acid sequence comprises a secreted polypeptide or functional portion normally secreted from a 
filamentous fungus. The third amino acid sequence comprises a cleavable linker and the fourth 
amino acid sequence comprises at least two desired polypeptides. 

Further provided are fusion polypeptides comprising, from the amino- to carboxy-terminus, first, 
fifth, third and second amino acid sequences. The fifth amino acid sequence comprises at least 
one desired polypeptide. 

Also provided are fusion polypeptides comprising a first amino acid sequence, a second amino 
acid sequence, and an insertion amino acid sequence. The insertion amino acid sequence 
comprises a fifth amino acid sequence flanked by third amino acid sequences, and is inserted into 
the second amino acid sequence. 

Additionally provided are processes for producing a desired polypeptide comprising transforming 
a host filamentous fungus with an expression vector containing a fusion nucleic acid described 
herein under conditions which permit expression of the fusion nucleic acid to cause the secretion 
of the desired polypeptide encoded by the fusion nucleic acid. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A and 1B depict the structure of A. niger glucoamylases. Figure 1A depicts GAI, and 
Figure 1B depicts GAII. 

Figure 2A, 2B, 2C, 2D, and 2E depict five plasmids containing synthetic DNA with the 
important restriction sites and amino acids encoded by the oligonucleotides. 

Figure 3A and 3B depict two plasmids containing synthetic DNA with the important restriction 
sites and amino acids encoded by the oligonucleotides. 

Figure 4A and 4B depict two plasmids containing synthetic DNA with the important restriction 
sites and amino acids encoded by the oligonucleotides. 

Figure 5A and 5B depict two plasmids containing synthetic DNA with the important restriction 
sites and amino acids encoded by the oligonucleotides. 

Figure 6A and 6B depict two plasmids containing synthetic DNA with the important restriction 
sites and amino acids encoded by the oligonucleotides. 

Figure 7 depicts a plasmid containing synthetic DNA with the important restriction sites and 
amino acids encoded by the oligonucleotides. 

Figure 8A and 8B depict two plasmids containing synthetic DNA with the important restriction 
sites and amino acids encoded by the oligonucleotides. 

Figure 9 depicts the pGA plasmids containing the A. niger var. awamori glaA promoter and 
coding region for amino acids 1-498 of mature glucoamylase and sequences encoding desired 
polypeptides positioned at the 3' end. 

Figure 10 depicts the pG plasmids containing the A. niger var. awamori glaA promoter and 
coding region for amino acids 1-468 of mature glucoamylase and sequences encoding desired 
polypeptides positioned at the 3' end. 

Figure 11 depicts the pGAXS plasmids containing the A. niger var. awamori glaA promoter and 
coding region for amino acids 1-616 of mature glucoamylase and with sequences encoding 
desired polypeptides inserted after the codon for amino acid 498 of mature glucoamylase. 

Figure 12 depicts pEpiGA containing the A. niger var. awamori glaA promoter and epitope 
positioned at the 5' end of the coding region for amino acids 1-498 of mature glucoamylase. 

Figure 13 depicts pEN603, pEN604 and pEN701 containing the coding region for Trichoderma 
longibrachiatum CBHI up to codon 459 of mature CBHI and sequences encoding desired 
polypeptides positioned at the 3' end. 

Figure 14 depicts pTIC, a Trichoderma longibrachiatum expression vector containing the 
promoter and terminator regions of the cbh1 gene and the pyr4 gene as a selectable marker. 

Figure 15 depicts pEN605, pEN606, pEN702 which are vectors designed for expression in 
Trichoderma longibrachiatum of fusion proteins containing CBHI up to amino acid 459 with 
desired polypeptides fused at the carboxyl terminus. 

DETAILED DESCRIPTION 

It was previously discovered that desired polypeptides can be expressed and secreted at levels 
higher than that previously obtained by fusing the desired polypeptide with a functional 
polypeptide which is normally secreted from a filamentous fungus. It has been previously shown 
that heterologous polypeptides such as bovine chymosin, glucoamylase and carboxyl (= aspartyl) 
protease from filamentous fungi could be expressed and secreted from Aspergillus species as 
described in U.S. Patent No. 5,364,770, EPO Publication No. 0 215 594, and WO 90/15860, 
each of which is expressly incorporated herein by reference. 

In a preferred embodiment, the present invention provides improved fusion nucleic acids 
comprising first, second, third and fourth nucleic acids. The first nucleic acid encodes a signal 
polypeptide functional as a secretory sequence in a first filamentous fungus. The second nucleic 
acid encodes a secreted polypeptide or functional portion of a secreted polypeptide that is 
normally secreted from a filamentous fungus. The third nucleic acid encodes a cleavable linker, 
and the fourth nucleic acid encodes at least two desired polypeptides. 

As used herein, a "fusion nucleic acid" comprises a number of nucleic acids operably linked 
together, as described herein. By "nucleic acid" herein is meant at least two nucleotides 
covalently linked together. The nucleic acid may be DNA, both genomic and cDNA, or RNA, or 
a hybrid of RNA and DNA. The preferred nucleic acid is DNA. 

The "first nucleic acid" encodes a signal polypeptide functional as a secretory sequence in a first 
filamentous fungus. Such signal sequences include those from glucoamylase, α-amylase and 
aspartyl proteases from Aspergillus niger var. awamori, Aspergillus niger and Aspergillus 
oryzae; signal sequences from cellobiohydrolase I, cellobiohydrolase II, endoglucanase I, 
endoglucanase II and endoglucanase III from Trichoderma; signal sequences from glucoamylase 
from Neurospora and Humicola; as well as signal sequences from eukaryotes including the signal 
sequence from bovine chymosin, human tissue plasminogen activator, human interferon and 
synthetic consensus eukaryotic signal sequences such as that described by Gwynne et al. (1987) 
supra. Particularly preferred signal sequences are those derived from polypeptides secreted by 
the expression host used to express and secrete the fusion polypeptide. For example, the signal 
sequence from glucoamylase from Aspergillus awamori is preferred when expressing and 
secreting a polypeptide from Aspergillus awamori. 

As used herein, "first amino acid sequences" correspond to secretory sequences which are 
functional in a filamentous fungus. Such amino acid sequences are encoded by first nucleic acids 
as defined. 

As used herein, "second nucleic acids" encode all or part of "secreted polypeptides" normally 
secreted from filamentous fungi. Such secreted polypeptides include glucoamylase, α-amylase 
and aspartyl proteases from Aspergillus niger var. awamori, Aspergillus niger and Aspergillus 
oryzae, cellobiohydrolase I, cellobiohydrolase II, endoglucanase I and endoglucanase III from 
Trichoderma, and glucoamylase from Neurospora species and Humicola species. As with the 
first nucleic acids, preferred secreted polypeptides are those which are normally secreted by the 
filamentous fungal expression host. Thus, for example, when using Aspergillus niger var. 
awamori, preferred secreted polypeptides are glucoamylase and α-amylase from Aspergillus 
niger var. awamori, most preferably glucoamylase. 

As indicated, all or part of the mature sequence of the secreted polypeptide is used in the 
construction of the fusion DNA sequences. In one embodiment, full length secreted polypeptides 
are used. However, functional portions of the secreted polypeptide may be employed. As used 
herein a "portion" of a secreted polypeptide may be defined in one of two ways. Preferably, a 
portion of a secreted polypeptide is defined functionally as that portion of a secreted polypeptide 
which when combined with the other components of the fusion polypeptide defined herein results 
in increased secretion of the desired polypeptide as compared to the level of desired polypeptide 
secreted when an expression vector is used which does not utilize the secreted polypeptide. 
Thus, the secretion level of a fusion DNA sequence encoding first, second, third and fourth 
amino acid sequences (the second DNA sequence containing all or a portion of a secreted 
polypeptide) is compared to the secretion level for a second fusion polypeptide containing only 
first, third and fourth amino acid sequences (i.e., without a secreted polypeptide or a portion 
thereof). Those amino acid sequences from the secreted polypeptide, and DNA sequences 
encoding such amino acids, which are capable of producing increased secretion as compared to 
the second fusion polypeptide comprise the "portion" of the secreted polypeptide as defined 
herein. 

A "functional portion of a secreted polypeptide" or grammatical equivalents may also mean a 
truncated secreted polypeptide that retains its ability to fold into a normal, albeit truncated, 
configuration. For example, in the case of bovine chymosin production by A. niger var. 
awamori it has been shown that fusion of prochymosin following the 11th amino acid of mature 
glucoamylase provided no benefit compared to production of preprochymosin (US patent 
5,364,770). In USSN 08/318,494, it was shown that fusion of prochymosin onto the C-terminus 
of preproglucoamylase up to the 297th amino acid of mature glucoamylase plus a repeat of 
amino acids 1-11 of mature glucoamylase yielded no secreted chymosin in A. niger var. awamori. 
In the latter case it is unlikely that the portion (approximately 63%) of the glucoamylase catalytic 
domain present in the fusion protein was able to fold correctly so that an aberrant, mis-folded 
and/or unstable fusion protein may have been produced which could not be secreted by the cell. 
The inability of the partial catalytic domain to fold correctly may have interfered with the folding 
of the attached chymosin. Thus, it is likely that sufficient residues of a domain of the naturally 
secreted polypeptide must be present to allow it to fold in its normal configuration independently 
of the desired polypeptide to which it is attached. Evans et al. (Gene 91:131-134, 1990) showed 
that a form of A. niger glucoamylase truncated at its C-terminus back to amino acid 460 in the 
catalytic domain was not efficiently secreted by yeast. However, the relevance of this to secretion 
in Aspergillus is not known. 

In most cases, the portion of the secreted polypeptide will both be correctly folded and result in 
increased secretion as compared to its absence. 

Similarly, in most cases, the truncation of the secreted polypeptide means that the functional 
portion retains a biological function. In a preferred embodiment, the catalytic domain of a 
secreted polypeptide is used, although other functional domains may be used, for example, the 
substrate binding domains. In the case of Aspergillus niger and Aspergillus niger var. awamori 
glucoamylase, preferred functional portions retain the catalytic domain of the enzyme, and 
include amino acids 1-471. Additionally preferred embodiments utilize the catalytic domain and 
all or part of the linker region. Alternatively, the starch binding domain of glucoamylase may be 
used, which comprises amino acids 509-616 of Aspergillus niger and Aspergillus niger var. 
awamori glucoamylase. 

In a preferred embodiment, the functional portion of the secreted polypeptide includes a linker 
region. The salient features of a linker region are that it forms an extended, semi-rigid spacer 
between independently folded domains. A linker region between the desired polypeptide and the 
functional domain of the naturally secreted fungal polypeptide may be beneficial in allowing the 
two polypeptides to fold independently. As is known in the art and discussed above, a number of 
secreted polypeptides are known to contain linkers, including hydrolases such as bacterial and 
fungal cellulases and hemicellulases (reviewed in Libby et al., 1994, supra). Preferred linkers are 
from glucoamylase from Aspergillus species and CBHI linkers from Trichoderma species. 
Preferably, the linker and the functional portion of the secreted polypeptide are from the same 
protein, although this is not required. 

In an alternative embodiment, the secreted polypeptide does not include a terminal linker region. 
For example, when the full length secreted polypeptide is utilized as the second polypeptide, the 
desired polypeptide is fused to the C-terminus of the second polypeptide in the absence of an 
additional linker, although the secreted polypeptide may contain an internal linker. Similarly, a 
functional domain such as a catalytic domain may be used without a linker region as well. 

Generally, when a functional portion of the secreted polypeptide is used as the second 
polypeptide, it is C-terminally truncated, i.e. contains an intact N-terminus. In alternate 
embodiments, the secreted polypeptide is N-terminally truncated, or optionally truncated at both 
ends to leave a functional portion. 

In some cases, such as glucoamylase or proteases, the second polypeptide may be naturally 
produced as a proprotein which is subsequently processed by removal of the prosequence to 
yield the mature protein. In these cases, the prosequence naturally associated with the second 
polypeptide may or may not be included in the fusion polypeptide. 

Generally, such portions of the secreted polypeptide comprise greater than 50% of the secreted 
polypeptide, preferably greater than 75%, most preferably greater than 90% of the secreted 
polypeptide. Such portions comprise preferably the amino-terminal portion of the secreted 
polypeptide. 
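
As a worked illustration of these thresholds, the fraction of the mature secreted polypeptide retained in a truncated portion can be computed directly. The sketch below is a hypothetical helper (the function names are not from the source); it assumes the 616-residue length of mature glucoamylase and uses truncation points discussed in this document.

```python
def fraction_retained(portion_length, full_length):
    """Fraction of the full-length secreted polypeptide present in the portion."""
    if not 0 < portion_length <= full_length:
        raise ValueError("portion must be between 1 residue and the full length")
    return portion_length / full_length


def preference_tier(fraction):
    """Map a retained fraction onto the thresholds stated in the text."""
    if fraction > 0.90:
        return "most preferred (>90%)"
    if fraction > 0.75:
        return "preferred (>75%)"
    if fraction > 0.50:
        return "general range (>50%)"
    return "below the general range"


if __name__ == "__main__":
    GAI_LENGTH = 616  # mature full-length glucoamylase, per the text
    for last_residue in (297, 471, 498, 616):
        f = fraction_retained(last_residue, GAI_LENGTH)
        print(f"1-{last_residue}: {f:.1%} -> {preference_tier(f)}")
```

Length is only a rough proxy here; as discussed above, whether the retained portion can fold independently matters as well.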

As indicated, the first nucleic acid encodes a signal polypeptide functional as a secretory signal in 
a first filamentous fungus. The signal sequences may be derived from a secreted polypeptide 
from a particular species of filamentous fungus. As also indicated, the second nucleic acid 
encodes a second amino acid sequence corresponding to all or part of a polypeptide normally 
secreted by either the first filamentous fungus (from which the signal polypeptide is obtained) or 
a second filamentous fungus (if the signal polypeptide and secreted polypeptide are from different 
filamentous fungi or if the signal polypeptide is obtained from a source other than a filamentous 
fungus, e.g. the chymosin signal from bovine species). 

As used herein, "third nucleic acids" comprise nucleic acids encoding a cleavable linker 
polypeptide, which under certain circumstances, will allow the separation of the sequences 
bordering the cleavable linker. For example, sequences that are recognized and cleaved by a 
protease or cleaved after exposure to certain chemicals are considered cleavable linkers. For 
example, cleavable linkers include, but are not limited to, the prosequence of bovine chymosin, 
the prosequence of subtilisin, prosequences of retroviral proteases including human 
immunodeficiency virus protease and sequences recognized and cleaved by trypsin (EP 578472, 
Takasuga et al., J. Biochem. 112(5):652 (1992)), factor Xa (Gardella et al., J. Biol. Chem. 
265(26):15854 (1990), WO 9006370), collagenase (J03280893, Tajima et al., J. Ferment. 
Bioeng. 72(5):362 (1991), WO 9006370), clostripain (EP 578472), subtilisin (including mutant 
H64A subtilisin, Forsberg et al., J. Protein Chem. 10(5):517 (1991)), chymosin, yeast KEX2 
protease (Bourbonnais et al., J. Biol. Chem. 263(30):15342 (1988)), thrombin (Forsberg et al., 
supra; Abath et al., BioTechniques 10(2):178 (1991)), Staphylococcus aureus V8 protease or 
similar endoproteinase-Glu-C to cleave after Glu residues (EP 578472, Ishizaki et al., Appl. 
Microbiol. Biotechnol. 36(4):483 (1992)), cleavage by NIa proteinase of tobacco etch virus 
(Parks et al., Anal. Biochem. 216(2):413 (1994)), endoproteinase-Lys-C (U.S. Patent No. 
4,414,332) and endoproteinase-Asp-N, Neisseria type 2 IgA protease (Pohlner et al., 
Biotechnology 10(7):799-804 (1992)), soluble yeast endoproteinase yscF (EP 467839), 
chymotrypsin (Altman et al., Protein Eng. 4(5):593 (1991)), enteropeptidase (WO 9006370), 
lysostaphin, a polyglycine specific endoproteinase (EP 316748), and the like. See e.g. Marston, 
F.A.O. (1986) Biochem. J. 240, 1-12. Particular amino acid sites that serve as chemical 
cleavage sites include, but are not limited to, methionine for cleavage by cyanogen bromide 
(Shen, PNAS USA 81:4627 (1984); Kempe et al., Gene 39:239 (1985); Kuliopulos et al., J. Am. 
Chem. Soc. 116:4599 (1994); Moks et al., Bio/Technology 5:379 (1987); Ray et al., 
Biotechnology 11:64 (1993)), acid cleavage of an Asp-Pro bond (Wingender et al., J. Biol. 
Chem. 264(8):4367 (1989); Gram et al., Bio/Technology 12:1017 (1994)), and hydroxylamine 
cleavage at an Asn-Gly bond (Moks supra). 

Although cleavage of the fusion polypeptide to release the desired polypeptide will often be 
useful, it is not necessary. In those cases in which the desired polypeptide retains its functionality 
when it is part of the fusion polypeptide, cleavage may not be required or desirable. For this 
reason, the third amino acid sequence (and third nucleic acid) comprising the cleavable linker 
may be optional. 

It should be understood that the third nucleic acid need only encode that amino acid sequence 
which is necessary to be recognized by a particular enzyme or chemical agent to bring about 
cleavage of the fusion polypeptide. Thus, the entire prosequence of, for example, chymosin or 
subtilisin need not be used. Rather, only that portion of the prosequence which is necessary for 
recognition and cleavage by the appropriate enzyme is required. 

Particularly preferred cleavable linkers are the KEX2 protease recognition site (Lys-Arg), which 
can be cleaved by a native Aspergillus KEX2-like protease, trypsin protease recognition sites of 
Lys and Arg, and the cleavage recognition site for endoproteinase-Lys-C. 
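
To illustrate how a Lys-Arg linker separates the parts of a fusion polypeptide, the following Python sketch splits a one-letter amino acid sequence after each Lys-Arg (KR) dipeptide. It is a deliberately simplified model: it assumes cleavage occurs on the carboxyl side of every KR pair and ignores sequence context and accessibility, which can influence cleavage by real KEX2-like proteases. The example sequence and the function name are invented for illustration.

```python
def cleave_after_dibasic(sequence, site="KR"):
    """Split a one-letter amino acid sequence after each occurrence of `site`.

    The site residues remain at the C-terminus of the upstream fragment,
    mimicking cleavage on the carboxyl side of Lys-Arg.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        end = pos + len(site)
        fragments.append(sequence[start:end])
        start = end
        pos = sequence.find(site, start)
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    # Hypothetical fusion: carrier portion, KR linker, then a desired peptide.
    toy_fusion = "MSTAAGLLE" + "KR" + "GSWQPE"
    print(cleave_after_dibasic(toy_fusion))
```

In this model the Lys-Arg residues stay attached to the upstream (carrier) fragment, so the downstream desired polypeptide is released with its own N-terminus.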

As used herein, "fourth nucleic acids" encode "desired polypeptides." "Polypeptides" as used 
herein include proteins and peptides. Such desired polypeptides include, but are not limited to, 
enzymes, hormones, growth factors, cytokines, structural proteins and plasma proteins. Thus, 
preferred desired polypeptides include, but are not limited to, bovine chymosin, human tissue 
plasminogen activator, human growth hormone, human interferon, human interleukin, human 
serum albumin, bacterial enzymes such as α-amylase from Bacillus species and lipase from 
Pseudomonas species, etc. Desired polypeptides further include fungal enzymes such as lignin 
peroxidase and Mn2+-dependent peroxidase from Phanerochaete, glucoamylase from Humicola 
species and aspartyl proteases from Mucor species. Smaller polypeptides are also preferred, as is 
more generally outlined below, such as the anti-coagulant hirudin and analogs such as 
R3-hirulog. Other suitable desired polypeptides include amylins and amylin antagonists, 
calcitonin, calcitonin gene-related peptides, glucagon, glucagon-like peptides, aprotinin, 
anti-bacterial peptides such as magainins, defensins and protegrins, somatostatin, atrial natriuretic 
peptide, brain natriuretic peptide, integrelin, epidermal growth factor, transforming growth factor 
α, insulin-like growth factor, and various protein epitopes for use as vaccines. 

In a preferred embodiment, the fourth nucleic acid comprises at least two coding nucleic acids 
each of which encodes a desired polypeptide. Generally, in this embodiment, the polypeptides 
range in size from about 2 to about 100 amino acids in length, with from about 10 to about 50 
being preferred, and from about 15 to about 40 being particularly preferred. In alternative 
embodiments, the polypeptides are larger, i.e. are considered proteins. Additionally, from about 
two to about 20 desired polypeptides are used, with from about 2 to about 10 being preferred 
and from about 2 to about 5 being particularly preferred. 

In this embodiment, the desired polypeptides may be the same polypeptide, i.e. identical 
polypeptides, or different polypeptides. That is, in one embodiment, the fourth nucleic acid 
comprises nucleic acids encoding multiple copies of a particular polypeptide. Alternatively, the 
fourth nucleic acid may encode single copies of several polypeptides, or multiple copies of 
several polypeptides. 

When the fourth nucleic acid encodes at least two desired polypeptides, the coding regions for 
the polypeptides may be linked together directly, i.e. without intervening sequences such as those 
encoding cleavable linkers. For example, certain desired polypeptides may retain biological 
activity in a multimeric form. See for example U.S. Patent No. 5,218,093, which describes the 
production of tandemly linked units of epidermal growth factor (EGF), which are produced as a 
single molecule. 

In a preferred embodiment, the coding regions for the desired polypeptides of the fourth nucleic 
acid are separated by at least one intervening sequence, for example, an intervening sequence 
that encodes a cleavable linker. Preferably, each coding region for a desired polypeptide is 
separated from the next one by a nucleic acid encoding a cleavable linker. That is, when the 
fourth nucleic acid encodes two desired polypeptides, the coding region for the cleavable linker is 
positioned between the nucleic acids encoding the desired polypeptides. Similarly, a fourth 
nucleic acid encoding three desired polypeptides will have a cleavable linker sequence between 
each of the desired polypeptide sequences, i.e. two cleavable linker sequences; a fourth nucleic 
acid encoding four desired polypeptides will have three cleavable linker sequences, i.e. a 
cleavable linker sequence between each of the desired polypeptide sequences; a fourth nucleic 
acid encoding five desired polypeptides will have four cleavable linker sequences, i.e. a cleavable 
linker sequence between each of the desired polypeptide sequences, and so on. 
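
The arrangement described above, in which n desired polypeptide coding regions are separated by n - 1 cleavable linker coding regions, can be sketched as follows. The function name and the placeholder segment labels are hypothetical; an actual construct would join nucleotide sequences in frame rather than text labels.

```python
def interleave_with_linkers(desired, linker):
    """Place one linker between each adjacent pair of desired segments."""
    if len(desired) < 2:
        raise ValueError("this embodiment uses at least two desired polypeptides")
    segments = []
    for index, item in enumerate(desired):
        if index > 0:
            segments.append(linker)
        segments.append(item)
    return segments


if __name__ == "__main__":
    for n in range(2, 6):
        labels = [f"desired_{i}" for i in range(1, n + 1)]
        construct = interleave_with_linkers(labels, "linker")
        print(n, "polypeptides ->", construct.count("linker"), "linkers")
```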

In this embodiment, the cleavable linkers between the desired polypeptides may be the same or 
different from each other or the cleavable linker encoded by the third nucleic acid. That is, in one 
embodiment, the third nucleic acid encodes the same cleavable linker as is present between the 
desired polypeptides. Subjecting the fusion protein to the cleavage conditions releases all of the 
desired polypeptides. Alternatively, the cleavable linkers are different. That is, cleavage at the 
cleavable linker encoded by the third nucleic acid releases the fourth polypeptide from the 
secreted polypeptide but does not release the separate desired polypeptides encoded by the 
fourth nucleic acid. In some embodiments, the cleavable linkers between the desired 
polypeptides encoded by the fourth nucleic acids are different. 

In an additional embodiment, the fourth nucleic acid further comprises a sequence encoding a
polypeptide that facilitates purification of the fusion protein or desired polypeptides. The fourth 
nucleic acid may contain multiple copies of a nucleic acid encoding a purification polypeptide; for 
example, when the fourth nucleic acid comprises more than one desired polypeptide coding 
region, each nucleic acid encoding a desired polypeptide may additionally contain a coding 
region for a purification polypeptide. Thus, after cleavage or separation of the desired 
polypeptides, each desired polypeptide may utilize the exogenous sequence for purification.
Alternatively, there may be only one copy of the nucleic acid encoding a polypeptide used for 
purification per fusion nucleic acid. In this embodiment, for example, the purification 
polypeptide is used to purify the entire fusion polypeptide which then may be cleaved into a 
number of desired polypeptides if required. If necessary, the purification polypeptide may also be 
separated from the desired polypeptides by a cleavage polypeptide. 

For example, sequences encoding specific epitopes may be included to allow purification via 
antibody columns, such as are known in the art. Similarly, sequences encoding proteins which 
bind to affinity chromatography columns may also be used. For example, Kuliopulos and Walsh 
(J. Am. Chem. Soc. 116:4599 (1994)) produced a fusion protein which comprised a bacterial
protein, a tandem repeat of a 13 amino acid peptide, and a carboxyl tag of six histidine residues.
The (His)6 tag allowed the fusion protein to be purified using Ni chelate chromatography. It is
also possible to fuse peptides to proteins such as glutathione transferase (GST) or maltose
binding protein and exploit the affinity purification methods available for these and other proteins 
in purification of the desired peptides. 

In a preferred embodiment, the present invention provides fusion nucleic acids comprising, from 
5' to 3', first, fifth, third and second nucleic acids. In this embodiment, the desired polypeptides 
are produced as N-terminal fusions rather than C-terminal, as described above. The first, third 
and second nucleic acids are as described above. A "fifth nucleic acid" encodes at least one 
desired polypeptide, as described above. Thus, in a preferred embodiment, a fifth nucleic acid 
encodes a single desired polypeptide. In alternate preferred embodiments, a fifth nucleic acid 
comprises at least two coding nucleic acids each of which encode desired polypeptides, as
described above for fourth nucleic acids. When the fifth nucleic acid encodes more than one
desired polypeptide, the fifth nucleic acid may also comprise intervening sequences such as third
nucleic acids encoding cleavable linkers, as described above for fourth nucleic acids.

In a further preferred embodiment, the present invention provides fusion nucleic acids comprising 
from 5' to 3', a first nucleic acid and a second nucleic acid. The second nucleic acid comprises a 
first and a second part, separated by an insertion point. At the insertion point, an insertion 
nucleic acid is present. An "insertion nucleic acid" comprises a fifth nucleic acid and in some 
embodiments is flanked by third nucleic acids comprising cleavable linkers. That is, the fifth 
nucleic acid is inserted into the second nucleic acid with a third nucleic acid at the 5' end and 
another third nucleic acid at the 3' end of the fifth nucleic acid. In this embodiment, it is 
preferred that the fifth nucleic acid comprise a single nucleic acid encoding a desired polypeptide, 
although in alternate embodiments, additional nucleic acids encoding desired polypeptides and 
additional cleavable linkers may be included, as will be appreciated by those in the art and 
generally described above. In this embodiment, it is preferred that the second nucleic acid 
encode a secreted polypeptide that contains a linker region or domain, and that the insertion 
point be within this linker region. That is, as described above, some secreted polypeptides have 
functional domains separated by linker regions (not to be confused with cleavable linkers). Thus, 
preferred embodiments utilize secreted polypeptides that contain at least one functional domain 
and all or part of the linker region. Preferably, the entire secreted polypeptide including the 
linker region is used. The insertion point may be located anywhere within the linker region. 

The above-defined nucleic acids encoding the corresponding amino acid sequences are combined 
to form a fusion nucleic acid. As so assembled, the nucleic acid will encode a "fusion 
polypeptide". In one embodiment, the fusion polypeptide comprises, from its amino terminus: a
first amino acid sequence comprising a signal polypeptide functional as a secretory sequence in a 
filamentous fungus, a second amino acid sequence comprising a secreted polypeptide or 
functional portion thereof normally secreted from a filamentous fungus, a third amino acid 
sequence comprising a cleavable linker and a fourth amino acid sequence comprising at least two 
desired polypeptides. As will be appreciated by those in the art, the invention also includes 
polypeptides from which the signal sequence has been cleaved from the fusion polypeptide.

In an additional embodiment, the fusion polypeptide comprises, from its amino terminus: a first
amino acid sequence comprising a signal polypeptide functional as a secretory sequence in a
filamentous fungus, a fifth amino acid sequence comprising at least one desired polypeptide, a
third amino acid sequence comprising a cleavable linker and a second amino acid sequence 
comprising a secreted polypeptide or functional portion thereof normally secreted from a 
filamentous fungus. As above, the invention also includes polypeptides from which the signal 
sequence has been cleaved from the fusion polypeptide. 

In a further embodiment, the fusion polypeptide comprises from its amino terminus, a first amino 
acid sequence comprising a signal polypeptide functional as a secretory sequence in a filamentous 
fungus, and a second amino acid sequence. The second amino acid sequence comprises a first 
and a second part, separated by an insertion point. An insertion amino acid sequence, comprising a fifth
amino acid sequence flanked by third amino acid sequences, is inserted between the first and 
second parts of the second amino acid sequence, at the insertion point. 
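
The three fusion polypeptide arrangements described above can be summarized as ordered lists, read from the amino terminus. The sketch below is illustrative only: the dictionary keys and the helper names are assumptions introduced for the example, and the element labels simply restate the "first" through "fifth" amino acid sequences of the text.

```python
# Element labels restating the numbered amino acid sequences of the text.
FIRST = "first (signal polypeptide)"
SECOND = "second (secreted polypeptide or functional portion)"
SECOND_PART_1 = "second, part 1"
SECOND_PART_2 = "second, part 2"
THIRD = "third (cleavable linker)"
FOURTH = "fourth (at least two desired polypeptides)"
FIFTH = "fifth (at least one desired polypeptide)"

# Hypothetical keys naming the three arrangements, N-terminus first.
LAYOUTS = {
    "c_terminal": [FIRST, SECOND, THIRD, FOURTH],
    "n_terminal": [FIRST, FIFTH, THIRD, SECOND],
    "insertion": [FIRST, SECOND_PART_1, THIRD, FIFTH, THIRD, SECOND_PART_2],
}


def fusion_layout(kind):
    """Return the ordered elements of one arrangement (a copy, so callers may edit it)."""
    return list(LAYOUTS[kind])


def without_signal(layout):
    """Model signal sequence cleavage by dropping the leading signal polypeptide."""
    return layout[1:] if layout and layout[0] == FIRST else list(layout)


if __name__ == "__main__":
    for kind in LAYOUTS:
        print(kind, "->", " | ".join(without_signal(fusion_layout(kind))))
```

In the insertion arrangement the fifth sequence is flanked by a cleavable linker on each side, which is why the third element appears twice.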

The fusion nucleic acids of the invention are constructed using well known techniques as is 
generally described in EPO Publication 0 215 594, incorporated by reference. Once made, the 
fusion nucleic acids are incorporated into any number of expression vectors as is known in the 
art. The expression vectors preferably contain regulatory sequences functional in the host to be 
transformed. Thus, in the case of filamentous fungus, such regulatory sequences include, but are 
not limited to, the transcriptional and translational regulatory sequences which may include, but 
are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop 
sequences, polyadenylation signals, translational start and stop sequences, and enhancer or 
activator sequences. In a preferred embodiment, the regulatory sequences include a promoter 
and transcriptional start and stop sequences. As is appreciated in the art, the promoter and 
transcription and translation initiation sequences are operably linked to the 5' end of the fusion 
nucleic acid, and the transcription termination and polyadenylation sequences are operably linked 
to the 3' end of the fusion nucleic acid.

"Operably linked" in this context means that the transcriptional and translational regulatory DNA 
is positioned relative to the coding sequences in such a manner that transcription is initiated. 
Generally, this will mean that the promoter and transcriptional initiation or start sequences are 
positioned 5' to the coding region. The transcriptional and translational regulatory nucleic acid
will generally be appropriate to the host cell used to express the fusion protein; for example,
transcriptional and translational regulatory nucleic acid sequences from Aspergillus will be used
to express the fusion protein in Aspergillus. Numerous types of appropriate expression vectors, 
and suitable regulatory sequences are known in the art for a variety of host cells. 

Promoter sequences encode either constitutive or inducible promoters. The promoters may be 
either naturally occurring promoters or hybrid promoters. Hybrid promoters, which combine 
elements of more than one promoter, are also known in the art, and are useful in the present
invention.

As used herein, a "promoter sequence" is a nucleic acid which is recognized by the particular
filamentous fungus for expression purposes. It is operably linked to a nucleic acid encoding the
above defined fusion polypeptide. Such linkage comprises positioning of the promoter with
respect to the translation initiation codon of the fusion nucleic acid.
The promoter sequence contains transcription and translation control sequences which mediate
the expression of the fusion nucleic acid. Examples include the promoter from the A. awamori
or A. niger glucoamylase genes (Nunberg, J.H. et al. (1984) Mol. Cell. Biol. 4, 2306-2315;
Boel, E. et al. (1984) EMBO J. 3, 1581-1585), the A. niger and A. oryzae alpha-amylase genes
(Korman et al. (1990) Curr. Genet. 17, 203-212; Wirsel et al. (1989) Mol. Microbiol. 3, 3-14;
Gines et al. (1989) Gene 79, 107-117; Tada et al. (1989) Agric. Biol. Chem. 53, 593-599), the
Mucor miehei carboxyl protease gene (EPO 0215594), the Trichoderma reesei cellobiohydrolase
I gene (Shoemaker, S.P. et al. (1984) European Patent Application No. EPO0137280A1), the
A. nidulans trpC gene (Yelton, M. et al. (1984) Proc. Natl. Acad. Sci. USA 81, 1470-1474;
Mullaney, E.J. et al. (1985) Mol. Gen. Genet. 199, 37-45), the A. nidulans alcA gene
(Lockington, R.A. et al. (1986) Gene 33, 137-149), the A. nidulans tpiA gene (McKnight, G.L.
et al. (1986) Cell 46, 143-147), the A. nidulans amdS gene (Hynes, M.J. et al. (1983) Mol.
Cell. Biol. 3, 1430-1439), and higher eukaryotic promoters such as the SV40 early promoter
(Barclay, S.L. and E. Meller (1983) Molecular and Cellular Biology 3, 2117-2130).

Likewise a "terminator sequence" is a nucleic acid which is recognized by the expression host to 
terminate transcription. It is operably linked to the 3' end of the fusion DNA encoding the fusion 
polypeptide to be expressed. Examples include the terminator from the A. nidulans trpC gene 
(Yelton, M. et al. (1984) Proc. Natl. Acad. Sci. USA 81, 1470-1474; Mullaney, E.J. et al.
(1985) Mol. Gen. Genet. 199, 37-45), the A. awamori or A. niger glucoamylase genes
(Nunberg, J.H. et al. (1984) Mol. Cell. Biol. 4, 2306-2315; Boel, E. et al. (1984) EMBO J. 3,
1581-1585), the A. niger or A. oryzae alpha-amylase genes (Korman et al. supra; Wirsel et al.
supra; Gines et al. supra; Tada et al. supra), and the Mucor miehei carboxyl protease gene (EPO 
Publication No. 0 215 594), although any fungal terminator is likely to be functional in the 
present invention. 

A "polyadenylation sequence" is a nucleic acid which when transcribed is recognized by the
expression host to add polyadenosine residues to transcribed mRNA. It is operably linked to the 
3' end of the fusion DNA encoding the fusion polypeptide to be expressed. Examples include 
polyadenylation sequences from the A. nidulans trpC gene (Yelton, supra; Mullaney, supra), the
A. awamori or A. niger glucoamylase genes (Nunberg, supra; Boel, supra), and the 
Mucor miehei carboxyl protease gene described above. Any fungal polyadenylation sequence, 
however, is likely to be functional in the present invention. 

In addition, the expression vector may comprise additional elements. For example, the 
expression vector may have two replication systems, thus allowing it to be maintained in two 
organisms, for example in fungal cells for expression and in a procaryotic host for cloning and 
amplification. Furthermore, for integrating expression vectors, the expression vector may 
contain at least one sequence homologous to the host cell genome, and preferably two 
homologous sequences which flank the expression construct. The integrating vector may be 
directed to a specific locus in the host cell by selecting the appropriate homologous sequence for 
inclusion in the vector. Constructs for integrating vectors are well known in the art. 

In addition, in a preferred embodiment, the expression vector contains a selectable marker gene 
to allow the selection of transformed host cells. Selection genes are well known in the art and
will vary with the host cell used. 

The fusion proteins of the present invention are produced by culturing a host cell transformed 
with an expression vector containing fusion nucleic acid, under the appropriate conditions to 
induce or cause expression of the fusion protein. The conditions appropriate for fusion protein 
expression will vary with the choice of the expression vector and the host cell, and will be easily 
ascertained by one skilled in the art through routine experimentation. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest is 
important. 

Appropriate host cells include filamentous fungal cells. The "filamentous fungi" of the present 
invention, which serve both as the expression hosts and the source of the first and second nucleic 
acids, are eukaryotic microorganisms and include all filamentous forms of the subdivision 
Eumycotina, Alexopoulos, C.J. (1962), Introductory Mycology, New York: Wiley. These 
fungi are characterized by a vegetative mycelium with a cell wall composed of chitin, glucans, 
and other complex polysaccharides. The filamentous fungi of the present invention are 
morphologically, physiologically, and genetically distinct from yeasts. Vegetative growth by 
filamentous fungi is by hyphal elongation. In contrast, vegetative growth by yeasts such as 
S. cerevisiae is by budding of a unicellular thallus. S. cerevisiae has a prominent, very stable 
diploid phase whereas diploids exist only briefly prior to meiosis in filamentous fungi like
Aspergilli and Neurospora. S. cerevisiae has 17 chromosomes as opposed to 8 and 7 for
A. nidulans and N. crassa respectively. Recent illustrations of differences between S. cerevisiae
and filamentous fungi include the inability of S. cerevisiae to process Aspergillus and
Trichoderma introns and the inability to recognize many transcriptional regulators of filamentous
fungi (Innis, M.A. et al. (1985) Science 228, 21-26).

Various species of filamentous fungi may be used as expression hosts including the following 
genera: Aspergillus, Trichoderma, Neurospora, Penicillium, Cephalosporium, Achlya,
Phanerochaete, Podospora, Endothia, Mucor, Fusarium, Humicola, Cochliobolus and
Pyricularia. Specific expression hosts include A. nidulans (Yelton, M., et al. (1984) Proc. Natl.
Acad. Sci. USA 81, 1470-1474; Mullaney, E.J. et al. (1985) Mol. Gen. Genet. 199, 37-45;
John, M.A. and J.F. Peberdy (1984) Enzyme Microb. Technol. 6, 386-389; Tilburn, et al.
(1982) Gene 26, 205-221; Ballance, D.J. et al. (1983) Biochem. Biophys. Res. Comm. 112,
284-289; Johnston, I.L. et al. (1985) EMBO J. 4, 1307-1311), A. niger (Kelly, J.M. and M.
Hynes (1985) EMBO J. 4, 475-479), A. awamori, e.g., NRRL 3112, ATCC 22342, ATCC 44733,
ATCC 14331 and strain UVK 143f, A. oryzae, e.g., ATCC 11490, N. crassa (Case, M.E. et al.
(1979) Proc. Natl. Acad. Sci. USA 76, 5259-5263; Lambowitz U.S. Patent No. 4,486,553;
Kinsey, J.A. and J.A. Rambosek (1984) Molecular and Cellular Biology 4, 117-122; Bull, J.H.
and J.C. Wooton (1984) Nature 310, 701-704), Trichoderma reesei, e.g., NRRL 15709, ATCC
13631, 56764, 56765, 56466, 56767, and Trichoderma viride, e.g., ATCC 32098 and 32086. A
preferred expression host is A. awamori in which the gene encoding the major secreted aspartyl
protease has been deleted. The production of this preferred expression host is described in 
United States Patent Application Serial No. 214,237 filed July 1, 1988, expressly incorporated 
herein by reference. 

The fusion nucleic acids are expressed as is known in the art to produce fusion proteins. In one 
embodiment, the fusion proteins comprise a signal polypeptide, a secreted polypeptide or 
functional portion thereof, one or more cleavable linkers and one or more desired polypeptides, 
combined as herein described. As will be appreciated by those in the art, the signal polypeptide 
may be cleaved off during expression to form a fusion protein, comprising a secreted polypeptide 
or functional portion thereof, a cleavable linker and at least two desired polypeptides, for 
example when the fusion polypeptide comprises (from N-terminus to C-terminus) a first, second, 
third and fourth polypeptide. The cleavable linker may then be cleaved using techniques known 
in the art, to release the desired polypeptide(s), with additional cleavage between multiple desired 
polypeptides done as required. The actual method of cleavage will depend on the cleavable 
linker selected. 

In some embodiments, after cleavage via chemicals or endoproteinases, the desired polypeptides 
contain unwanted amino acids from the amino or carboxy termini. In this embodiment, a variety 
of aminopeptidases and carboxypeptidases of differing specificities may be used to remove the 
amino acids from the cleavable linker. Examples include, but are not limited to, pancreatic 
carboxypeptidase B to remove basic residues from the carboxyl terminus, soluble yeast 
carboxypeptidase-ysc-alpha (EP467839), carboxypeptidase E or N (WO 9206211 (in Japanese)),
aminopeptidase A (WO 9206211 (in Japanese)), Lactococcus lactis dipeptidyl peptidase IV
which removes prolyl-prolyl extensions from the amino terminus, and X-prolyl dipeptidyl
aminopeptidase from Aspergillus oryzae or Lactococcus lactis (Tachi et al., Phytochemistry
31(11):3707 (1992); Yosphe-Besancon, Biotechnol. Appl. Biochem. 20(1):131 (1994)).
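
The two-step processing described above (endoproteolytic cleavage at the linker, followed by exopeptidase trimming of leftover linker residues) can be illustrated with a small string-based sketch. It assumes, purely for illustration, that the cleavable linker is a Lys-Arg (KR) dipeptide cut on its carboxyl side, as with a KEX2-type protease, and that a carboxypeptidase B-like activity then removes basic residues from the carboxyl terminus. The carrier and peptide sequences are hypothetical placeholders, and the function names are not taken from the disclosure.

```python
def cleave_after_site(sequence, site="KR"):
    """Split a one-letter amino acid string on the carboxyl side of each `site`."""
    fragments = []
    start = 0
    index = sequence.find(site)
    while index != -1:
        end = index + len(site)
        fragments.append(sequence[start:end])   # fragment keeps the site at its C-terminus
        start = end
        index = sequence.find(site, start)
    if start < len(sequence):
        fragments.append(sequence[start:])      # trailing fragment with no site
    return fragments


def trim_c_terminal_basic(fragment):
    """Remove Lys/Arg residues from the carboxyl terminus (carboxypeptidase B-like)."""
    return fragment.rstrip("KR")


if __name__ == "__main__":
    carrier = "ASGSTV"    # hypothetical stand-in for the secreted polypeptide
    peptide = "PEPTIDE"   # hypothetical desired polypeptide
    fusion = carrier + "KR" + peptide + "KR" + peptide
    for fragment in cleave_after_site(fusion):
        print(fragment, "->", trim_c_terminal_basic(fragment))
```

Note that the trimming step removes every consecutive basic residue at the carboxyl terminus, so a desired polypeptide that naturally ends in Lys or Arg would also be shortened; the choice of exopeptidase therefore depends on the linker and on the termini of the desired polypeptide.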

The desired polypeptides may also be additionally purified or isolated if necessary, in a variety of 
ways known to those skilled in the art depending on what other components are present in the 
sample. Standard purification methods include electrophoretic, molecular, immunological and 
chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse-phase
HPLC chromatography, and chromatofocusing. Ultrafiltration and diafiltration techniques, in
conjunction with protein concentration, are also useful. For general guidance in suitable 
purification techniques, see Scopes, R., Protein Purification, Springer-Verlag, NY (1982). 
Purification tags or sequences may also be used, as is known in the art. The degree of 
purification necessary will vary depending on the use of the desired polypeptide. In some 
instances no purification will be necessary. 

The desired polypeptides may also be chemically modified, formulated for use or administration 
as needed, as will be appreciated by those in the art. 

The following examples serve to more fully describe the manner of using the above-described 
invention, as well as to set forth the best modes contemplated for carrying out various aspects of 
the invention. It is understood that these examples in no way serve to limit the true scope of this 
invention, but rather are presented for illustrative purposes. The references cited herein are 
incorporated by reference. 

EXAMPLES 
General methods 
Methods for Aspergillus transformation 
The expression plasmids designed for use with Aspergillus all contained the A. niger pyrG gene
as selectable marker for transformation. This gene encodes orotidine 5'-monophosphate 
decarboxylase, an enzyme from the uridine biosynthetic pathway. The Aspergillus strains used 
were all pyrG mutant strains which consequently required uridine for growth. Selection for 
transformants involved growth in the absence of exogenous uridine. Purified plasmid DNA was 
digested with HindIII prior to transformation into Aspergillus.

Two different strains, GCGAP3-4 and dgr246 P2, were used in this work, both of which were 
derived from strain UVK143f (US patent 5,364,770), a glucoamylase over-producing strain itself 
derived from strain NRRL 3112. A. niger var. awamori strain GCGAP3-4 has had the genes
encoding the major secreted aspartic proteinase (pepA) and glucoamylase (glaA) deleted.
Deletion of the glaA gene was achieved by transformation of strain GC12 (genotype pyrG5;
argB3; described by Berka et al., 1991, in: Applications of Enzyme Biotechnology, Eds. Kelly,
J.W. and Baldwin, T.O., Plenum Press, New York) with a linear DNA fragment having flanking
regions of the glaA locus at either end with the glaA promoter and part of the coding region
(from 245 bp 5' of the initiation codon to a position corresponding to codon 200 within the
coding region) replaced by the pyrG gene from Aspergillus nidulans. One of the transformants in
which this linear fragment had integrated at the glaA locus, inactivating the glaA gene, was called
strain GCAGAM64. The pepA gene was deleted from strain GCAGAM64 by the same method as
described by Berka et al. (1990, Gene 86:153-162). This method involved transformation with a 
linear fragment of DNA comprising flanking regions of the pepA locus with the entire coding 
region (from 178 bp 5' of the initiation codon to approximately 900 bp 3' of the termination 
codon) replaced by the argB gene of A. nidulans. One of the transformants in which this DNA 
fragment had integrated at the pepA locus and inactivated the pepA gene was designated strain 
GCGAP3. A uridine requiring, pyrG deficient, derivative of strain GCGAP3 was selected by 
resistance to fluoroorotic acid (van Hartingsveldt et al. 1987, Mol. Gen. Genet. 206:71-75) and
called strain GCGAP3-4. It was observed by Southern analysis that a spontaneous deletion of 
DNA had occurred at the glucoamylase locus of this strain to remove the A. nidulans pyrG gene 
which had integrated at that site as well as some of the flanking regions of the glaA locus. 

Strain dgr246 P2 has had the pepA gene deleted and is pyrG minus. In addition, strain dgr246 P2
has undergone several rounds of mutagenesis and screening or selection for improved production
of a heterologous gene product. Both strains are described by Ward, M. et al. (1993) Appl.
Microbiol. Biotechnol. 39:738-743.

The transformation protocol is a modification of the Campbell method (Campbell et al. (1989)
Curr. Genet. 16:53-56). All solutions and media were either autoclaved or filter sterilized
through a 0.2 micron filter. Spores of A. niger var. awamori were harvested from complex media
agar (CMA) plates. CMA contained 20 g/l dextrose, 20 g/l Difco Brand malt extract, 1 g/l Bacto
Peptone, 20 g/l Bacto agar, 20 ml/l of 100 mg/ml arginine, 20 ml/l of 100 mg/ml uridine, 1 ml/l
of 10 mg/ml PABA, 10 ml/l of met/bio solution, and 1 ml/l of 50 mg/ml streptomycin. Met/bio
solution was composed of 50 g of L-methionine, 200 mg of D-biotin and distilled water up to 1
liter. An agar plug of approximately 1.5 cm square of spores was used to inoculate 100 ml of
Yeast Extract Broth with glucose (YEG; 5 g/l yeast extract, 20 g/l glucose, 20 ml/l 100 mg/ml
arginine, 20 ml/l 100 mg/ml uridine, 1 ml/l 50 mg/ml streptomycin). The flask was incubated at
37°C on a shaker at 250-275 rpm, overnight. The mycelia were harvested through sterile
Miracloth (Calbiochem, San Diego, CA, USA) and washed with 200 ml of Solution A (0.8 M
MgSO4 in 10 mM sodium phosphate, pH 5.8). The washed mycelia were placed in a sterile
solution of 100 mg of Novozym 234 (Novo Nordisk Industri A/S, Copenhagen, Denmark) in 20
ml of solution A. This was incubated at 28°C at 200 rpm for 1 hour in a sterile 250 ml plastic
bottle (Corning Inc, Corning, New York). After incubation, the protoplasting solution was
filtered through sterile Miracloth into a sterile 50 ml conical tube (Sarstedt, USA). The resulting
liquid containing protoplasts was divided equally amongst four 50 ml conical tubes. Forty ml of
solution B (1.2 M sorbitol, 50 mM CaCl2, 10 mM Tris, pH 7.5) were added to each tube and
centrifuged in a table top clinical centrifuge (Damon IEC HN SH centrifuge) at 3/4 speed for 10
minutes. The supernatant from each tube was discarded and 20 ml of fresh solution B was
added to one tube, mixed, then poured into the next tube until all the pellets were resuspended.
The tube was then centrifuged at 3/4 speed for 10 minutes. The supernatant was discarded, 20
ml of fresh solution B was added, and the tube was centrifuged for 10 minutes at 3/4 speed. This
wash was repeated one last time before resuspending the washed protoplasts in solution B at a
density of 0.5-1.0 x 10^7 protoplasts/100 ul. To each 100 ul of protoplasts in a sterile 15 ml
conical tube (Sarstedt, USA), 10 ul of the transforming plasmid DNA was added. To this, 12.5 ul of
solution C (50% PEG 4000, 50 mM CaCl2, 10 mM Tris, pH 7.5) was added and the tube was
placed on ice for 20 minutes. One ml of solution C was added and the tube was removed from
the ice to room temperature and shaken gently. Two ml of solution B was added immediately to
dilute solution C. The transforming mix was added equally to 3 tubes of melted MMS overlay (6 
g/l NaNO3, 0.52 g/l KCl, 1.52 g/l KH2PO4, 218.5 g/l D-sorbitol, 1.0 ml/l trace elements LW, 10
g/l SeaPlaque agarose (FMC Bioproducts, Rockland, Maine, USA), 20 ml/l 50% glucose, 2.5
ml/l 20% MgSO4.7H2O, 10 ml/l met/bio solution, 2.0 ml/l 10 mg/ml PABA, pH to 6.5 with
NaOH) that were stored in a 37°C water bath. Trace elements LW consisted of 1 g/l
FeSO4.7H2O, 8.8 g/l ZnSO4.7H2O, 0.4 g/l CuSO4.5H2O, 0.15 g/l MnSO4.4H2O, 0.1 g
Na2B4O7.10H2O, 50 mg/l (NH4)6Mo7O24.4H2O, 250 ml H2O, 200 ul/l concentrated HCl. The
melted overlays with the transformation mix were immediately plated onto 3 MMS plates (same
as MMS overlay recipe with the exception of 20 g/l of Bacto agar instead of 10 g/l of SeaPlaque)
that had been supplemented with 200 ul/plate of 100 mg/ml of arginine added directly on top of 
the agar plate. After the agar solidified, the plates were incubated at 37°C until transformants 
grew. 
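
The recipes above give supplements as a stock concentration plus a volume added per liter, and give the protoplast density per 100 ul aliquot. The following sketch shows the unit arithmetic for converting these to final concentrations. It is a convenience calculation only; the function names are assumptions, and the small volume change caused by adding the stock is ignored.

```python
def final_concentration_g_per_l(stock_mg_per_ml, volume_ml_per_l):
    """Final concentration (g/l) from a stock (mg/ml) added at a given volume (ml per liter).

    mg/ml * ml/l = mg/l; dividing by 1000 converts mg/l to g/l.
    """
    return stock_mg_per_ml * volume_ml_per_l / 1000.0


def protoplasts_per_ml(count_per_aliquot, aliquot_ul=100.0):
    """Convert a protoplast count per aliquot (ul) into a count per ml."""
    return count_per_aliquot * (1000.0 / aliquot_ul)


if __name__ == "__main__":
    # 20 ml/l of a 100 mg/ml arginine or uridine stock (as in CMA and YEG)
    print("arginine/uridine, g/l:", final_concentration_g_per_l(100, 20))
    # 1 ml/l of a 50 mg/ml streptomycin stock
    print("streptomycin, g/l:", final_concentration_g_per_l(50, 1))
    # 0.5-1.0 x 10^7 protoplasts per 100 ul aliquot
    print("protoplasts per ml:", protoplasts_per_ml(0.5e7), "to", protoplasts_per_ml(1.0e7))
```

Under these assumptions, 20 ml/l of a 100 mg/ml stock corresponds to 2 g/l, and the stated protoplast density corresponds to 5 x 10^7 to 1 x 10^8 protoplasts per ml.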

The sporulating transformants were picked off with a sterile toothpick onto a plate of minimal 
media + glucose (MM). MM consisted of 6 g/l NaNO3, 0.52 g/l KCl, 1.52 g/l KH2PO4, 1 ml/l
Trace elements LW, 20 g/l Bacto agar, pH to 6.5 with NaOH, 25 ml/l of 40% glucose, 2.5 ml/l
of 20% MgSO4.7H2O, 1 ml/l of 50 mg/ml streptomycin/10 mg/ml carbenicillin, 20 ml/l of 100
mg/ml arginine. Once the transformants grew on MM they were transferred to CMA plates. 

Each transformant was given a designation which included the host strain, either GAP3-4 
(=GCGAP3-4) or dgr246 (=dgr246 P2), the plasmid used for transformation, e.g., GAKHi+
(=pGAKHi+), and a number to distinguish individual transformants. As an example,
transformant #3 obtained from strain GCGAP3-4 with plasmid pGAKHi+ would be designated
GAP3-4GAKHi+ #3.
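
The naming convention just described (abbreviated host strain, plasmid name without its leading "p", then a transformant number) can be expressed as a short helper. This is a sketch of the convention as read from the text; the function name, the mapping table, and the transformant number used in the example are assumptions for illustration.

```python
# Abbreviated host names as given in the text.
HOST_SHORT_NAMES = {
    "GCGAP3-4": "GAP3-4",
    "dgr246 P2": "dgr246",
}


def transformant_designation(host_strain, plasmid, number):
    """Build a designation such as 'dgr246GAKHi+ #12' from its three parts."""
    host = HOST_SHORT_NAMES[host_strain]
    # Drop the leading 'p' of the plasmid name (e.g. pGAKHi+ becomes GAKHi+).
    plasmid_short = plasmid[1:] if plasmid.startswith("p") else plasmid
    return f"{host}{plasmid_short} #{number}"


if __name__ == "__main__":
    # Hypothetical transformant number chosen only for the example.
    print(transformant_designation("dgr246 P2", "pGAKHi+", 12))
```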

Methods for culture of Aspergillus transformants in shake flasks 
A 1.5 cm square agar plug of each strain was added to 50 ml, in a 250 ml shake flask, of an
inoculum medium called CSL + fructose (100 g/l corn steep liquor (50% solids, National), 1 g/l
NaH2PO4.H2O, 0.5 g/l MgSO4, 100 g/l maltose, 10 g/l glucose, 50 g/l fructose, 3 ml/l Mazu
DF60-P (Mazer Chemicals, Gurnee, IL, USA), pH to 5.8 with NaOH, 1 ml/l of 50 mg/ml
streptomycin/10 mg/ml carbenicillin). Flasks were incubated at 37°C, 200 rpm, for 2 days. Five
ml of the 2 day old medium were inoculated into 50 ml of one of two different possible
production media, called either Clofine special or Sheftone N. Clofine special had the following
components: 70 g/l sodium citrate, 15 g/l (NH4)2SO4, 1 g/l NaH2PO4.H2O, 1 g/l MgSO4, 1 ml
Tween 80, pH to 6.2 with NaOH, 2 ml/l Mazu DF60-P, 60 g/l spray dried tofu TF30 soy milk
powder (Armour Food Ingredients, Springfield, Kentucky, USA), 120 g/l maltose, 20 ml/l of
100 mg/ml arginine. Sheftone N medium had the following components: 100 g/l Sheftone N
(Sheffield Products, Norwich, NY, USA); 13.6 g/l (NH4)2SO4; 0.8 g/l MgSO4; 1 g/l KCl; 1 g/l
KH2PO4; 120 g/l sodium citrate; 1 ml Tween 80; 13 ml/l Mazu DF60-P; 10 g/l glucose; 100 g/l
maltose; 20 ml/l of 100 mg/ml arginine. The production media flasks were incubated at 37°C,
200 rpm for up to 5 days and supernatant samples were taken for analysis at various times 
throughout. 

Detection of glucoamylase-polypeptide fusion proteins 
produced by Aspergillus transformants 

Five ul samples of culture supernatant were mixed with 15 ul of running buffer and 5 ul of
tracking dye and subjected to sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis
using precast gels according to the manufacturer's instructions (Daiichi or Novex). The gels were
either stained for protein with Coomassie stain or the protein was transferred to nitrocellulose 
filters by Western blotting (Towbin et al., 1979, Proc. Natl. Acad. Sci. USA 76:4350-4354). 



WO 98/31821 PCT/US98/00474 

Glucoamylase was visualized on Western blots by sequential treatment with rabbit anti- 
glucoamylase antibody and goat anti-rabbit IgG conjugated with horse radish peroxidase (HRP) 
followed by HRP color development by incubation with H2O2 and 4-chloro-1-naphthol. The
presence of hemagglutinin (HA) epitope on glucoamylase-HA epitope fusion proteins transferred
to Western blots was detected by sequential treatment with a monoclonal mouse anti-HA
antibody (clone 12CA5, Boehringer Mannheim, Indianapolis, IN, USA) and goat anti-mouse IgG
conjugated with horse radish peroxidase (HRP) followed by HRP color development by
incubation with H2O2 and 4-chloro-1-naphthol.

As an initial screen, transformants were cultured in Clofine special production medium for 4 or 5 
days. Under these conditions, the glucoamylase-polypeptide fusion proteins appeared on 
Coomassie stained gels or on Western blots as lower in molecular weight than full-length 
glucoamylase and, thus, could be distinguished from native glucoamylase. This was the case even 
if the predicted molecular weight of the glucoamylase-polypeptide fusion protein was similar to 
glucoamylase (e.g., the expected product from pGA(KHi)5+ transformants described in Example
3). The lower than predicted molecular weight of some of the glucoamylase-hirulog fusion
proteins was presumed to be due to proteolysis. Proteolysis of glucoamylase-polypeptide fusion 
proteins in liquid culture was minimized by using Sheftone N production medium, by production 
in strain dgr246 P2 as compared to strain GCGAP3-4, and by examining glucoamylase- 
polypeptide products at early time points during culture (there was less effect of proteolysis at 
earlier time points, i.e., before 5 days of culture, although yields were lower). 

Using the Western blot method it was possible to identify those transformants of GAP3-4 which 
produced glucoamylase-polypeptide fusion proteins since the untransformed parent strain 
produced no native glucoamylase. The polyclonal anti-glucoamylase antibody had much greater 
affinity for full-length glucoamylase bearing the starch binding domain than it did for forms of 
glucoamylase without the starch binding domain. As a result, it was also possible to identify 
those transformants of strain dgr246 P2 in which the expression vector had integrated at the glaA 
locus since these transformants made no full-length glucoamylase and only showed the weaker 
staining on Western analysis associated with glucoamylase-hirulog fusion proteins. However, in 
the majority of transformants the expression vector integrated at unknown sites in the genome 
rather than at the glaA locus. All the transformants described in greater detail below were of this 
type.

It was possible to identify the Coomassie-stained protein band corresponding to the
glucoamylase-polypeptide fusion protein in transformants of GCGAP3-4 or dgr246 P2 obtained
with any of the expression vectors by comparison with supernatant from the untransformed 
control strains. Those transformants which produced the greatest amount of glucoamylase- 
hirulog as judged by the protein gels or Western blots were identified and used for further 
studies. 

Polypeptides were purified by reverse phase HPLC, and sequence analysis and mass spectrometer
analysis were performed as follows. Samples were applied to a Vydac 218TP54 reverse phase HPLC
column, using a Hewlett Packard 1090M HPLC system. The mobile phase was 0.1% TFA 
(trifluoroacetic acid) and the organic phase was 0.08% TFA in acetonitrile. The gradient was run 
to 60% acetonitrile at 0.5%/min after a 5 minute initial wash. Peaks of interest were collected by 
hand at the appropriate times and submitted for sequence analysis and mass spectrometer 
analysis. 
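As a rough illustration of the gradient timing described above, the following Python sketch computes the nominal acetonitrile percentage at a given run time. It assumes, which the text does not state, that the gradient starts at 0% acetonitrile immediately after the 5 minute wash. The function and parameter names are hypothetical.

```python
def percent_acetonitrile(t_min, wash_min=5.0, slope=0.5, final=60.0, start=0.0):
    """Nominal % acetonitrile at run time t_min (minutes).

    Assumption: the gradient begins at `start` % right after the wash
    and rises linearly at `slope` % per minute until `final` % is reached.
    """
    if t_min <= wash_min:
        return start
    return min(final, start + slope * (t_min - wash_min))


def gradient_end_time(wash_min=5.0, slope=0.5, final=60.0, start=0.0):
    """Run time (minutes) at which the gradient reaches its final value."""
    return wash_min + (final - start) / slope


if __name__ == "__main__":
    print(gradient_end_time())       # total minutes until 60% is reached
    print(percent_acetonitrile(65))  # % acetonitrile 60 minutes into the gradient
```

Under these assumptions a peak collected at a known time can be assigned an approximate elution percentage, which may help when comparing runs.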

Tryptic digestion of polypeptides was performed as follows. To 200 micrograms of protein, 1N
HCl was added to a final concentration of 0.1N and the sample was incubated on ice for 10
minutes. The denatured protein was precipitated by the addition of 9 volumes of 100% acetone 
and centrifugation at 13,000xg. The supernatant was removed and the pellet washed with Sf" 
microliters of 80% acetone. Forty microliters of 8M urea in 1M Tris (pH 8.53 with TFA) was 
added to dissolve the pellet, followed by 160 microliters of deionized water and 8 microliters of 1 
mg/ml trypsin (Worthington Labs) in 0.1M HCl. The sample was incubated at 37°C for 2 hours.
Upon completion, the reaction was terminated by the addition of 10% TFA to a final 
concentration of 1%. Sequence analysis of peptides was performed on an ABI automated 
sequencer using the Edman degradation chemistry. 
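The volumes given for the digestion imply final reagent concentrations that are not stated explicitly. The Python sketch below works them out, assuming that the volume of the acetone-washed pellet is negligible and that the stated volumes are simply additive. The variable names are illustrative only.

```python
# Volumes (microliters) and stock concentrations taken from the protocol text.
UREA_TRIS_UL = 40.0      # 8M urea in 1M Tris
WATER_UL = 160.0         # deionized water
TRYPSIN_UL = 8.0         # 1 mg/ml trypsin stock
UREA_STOCK_M = 8.0
TRIS_STOCK_M = 1.0
TRYPSIN_MG_PER_ML = 1.0  # numerically equal to micrograms per microliter
PROTEIN_UG = 200.0

# Assumption: pellet volume is negligible and volumes are additive.
total_ul = UREA_TRIS_UL + WATER_UL + TRYPSIN_UL
urea_final_m = UREA_STOCK_M * UREA_TRIS_UL / total_ul
tris_final_m = TRIS_STOCK_M * UREA_TRIS_UL / total_ul
trypsin_ug = TRYPSIN_UL * TRYPSIN_MG_PER_ML
substrate_to_enzyme = PROTEIN_UG / trypsin_ug

# Volume of 10% TFA needed to bring the mixture to 1% TFA:
# 10 * v = 1 * (total + v)  ->  v = total / 9
tfa_ul = total_ul / 9.0

if __name__ == "__main__":
    print(f"total volume: {total_ul:.0f} microliters")
    print(f"urea: {urea_final_m:.2f} M, Tris: {tris_final_m:.2f} M")
    print(f"substrate:enzyme (w/w): {substrate_to_enzyme:.0f}:1")
    print(f"10% TFA to add: {tfa_ul:.1f} microliters")
```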

For mass spectrometry analysis 50 microliters of polypeptide collected from HPLC was mixed 
with acetic acid to give a final acetic acid concentration of 1%. This sample was introduced to a 
Hewlett Packard model 55987 Electrospray Mass Spectrometer through a 50 microliter loop
attached to a Valco manual injection valve. The pump providing flow was a Hewlett Packard 
1090M HPLC running at 0.05 ml/min. Mass data were collected and analyzed using Hewlett
Packard ChemStation Data Analysis Software.

Example 1

Assembly of single and tandem repeats of R3-hirulog coding sequences 

We have tested the production of fusion proteins consisting of Aspergillus glucoamylase, or 
portions thereof, coupled to single copies or tandem repeats of the peptide R3-hirulog. This 
peptide consists of 18 amino acids with the following sequence: RPGGGGNGDFEEIPEEYL. 
Between glucoamylase and the peptide and between individual peptide units in a tandem array 
either a lysine (K) or lysine + arginine (KR) were positioned. In some cases we positioned a 
lysine followed by five histidine residues at the carboxyl end of the fusion protein. 
Oligonucleotides were synthesized and assembled in order to encode the following combinations
of elements, where K=lysine, R=arginine, H=histidine, X=R3-hirulog.

K-X
K-X-K(H)5
K-X-K-X-K-X-K-X-K-X
KR-X-KR-X-KR-X-KR-X-KR-X

The rationale for proteolytic cleavage of these polypeptides to release monomeric forms of R3-
hirulog is as follows. Trypsin will cleave on the carboxyl side of basic residues (i.e., either Arg 
or Lys) and KEX2 proteinase will cleave on the carboxyl side of dibasic residues (i.e., Lys-Arg). 
However, it is expected that, in common with other serine proteinases, neither trypsin nor KEX2 
proteinase will cleave the polypeptide if the amino acid on the carboxyl side of the cleavage site 
is a proline. Endoproteinase-Lys-C would be expected to cleave on the carboxyl side of lysine 
residues. The polypeptides above which have the Lys-Arg sequence of amino acids before or 
between R3-hirulog monomers would be expected to be cleaved by Aspergillus KEX2 
proteinase within the secretory apparatus of Aspergillus. Those polypeptides with only a Lys 
residue separating glucoamylase and the R3-hirulog monomers would not be expected to be 
cleaved by KEX2 proteinase but could be cleaved in vitro with either trypsin or endoproteinase- 
Lys-C after the Lys residues. In all cases mentioned above, the product of cleavage would be R3-
hirulog with either one basic residue (Lys) or two basic residues (Lys-Arg) at the carboxyl
terminus. These basic residues could be removed from R3-hirulog by the action of, for example, 
carboxypeptidase B, which will remove basic residues from the carboxyl terminus of peptides. 
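The cleavage rationale above can be sketched in Python. The model below is deliberately simplified: trypsin is treated as cutting after every Lys or Arg unless the next residue is Pro, and carboxypeptidase B as stripping all carboxyl-terminal Lys and Arg. Real enzyme specificity is less absolute, and the glucoamylase portion of the fusion is omitted. The function names are hypothetical.

```python
R3_HIRULOG = "RPGGGGNGDFEEIPEEYL"


def trypsin_digest(seq):
    """Cleave after K or R, except when the next residue is P (simplified rule)."""
    fragments, start = [], 0
    for i in range(len(seq) - 1):
        if seq[i] in "KR" and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def carboxypeptidase_b(peptide):
    """Remove basic residues (K, R) from the carboxyl terminus."""
    return peptide.rstrip("KR")


# K-X-K-X-K-X-K-X-K-X, without the upstream glucoamylase sequence.
construct = ("K" + R3_HIRULOG) * 5
fragments = trypsin_digest(construct)
monomers = [carboxypeptidase_b(f) for f in fragments if len(f) > 1]

if __name__ == "__main__":
    for fragment in fragments:
        print(fragment)
```

Because the amino-terminal Arg of R3-hirulog is followed by Pro, the model leaves it intact, so each internal fragment is a full monomer carrying one carboxyl-terminal Lys, which the carboxypeptidase B step then removes.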

The double-stranded synthetic DNA molecules encoding the above sequences were designed
either to have a single-strand overhang compatible for ligation with NheI restricted DNA at the
5' end and a single-strand overhang compatible with BstEII restricted DNA at the 3' end or
single-strand overhangs compatible with NheI restricted DNA at both ends.

Several pairs of complementary oligonucleotides were synthesized with the following sequences. 

Oligonucleotide pair 1 encodes an NheI compatible overhang, a lysine, the first 13 amino acids
of R3-hirulog, and has an XmaI compatible overhang at the 3' end.

HirK1-A: 5'-CTAGCAAGCGCCCCGGCGGCGGCGGCAACGGCGACTTCGAGGAGATC
HirK1-B: 5'-CCGGGATCTCCTCGAAGTCGCCGTTGCCGCCGCCGCCGGGGCGCTTG
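A quick consistency check on oligonucleotide pair 1 can be sketched in Python: the two strands should anneal leaving four-nucleotide 5' overhangs, and the top strand should encode a lysine followed by the first 13 residues of R3-hirulog. The reading frame used here assumes the NheI site supplies the preceding codon, as described for the glucoamylase linker elsewhere in this specification. The codon table is a minimal subset covering only the codons in this oligonucleotide, and the helper names are hypothetical.

```python
# Sequences for oligonucleotide pair 1, both written 5' to 3'.
HIRK1_A = "CTAGCAAGCGCCCCGGCGGCGGCGGCAACGGCGACTTCGAGGAGATC"
HIRK1_B = "CCGGGATCTCCTCGAAGTCGCCGTTGCCGCCGCCGCCGGGGCGCTTG"
R3_HIRULOG = "RPGGGGNGDFEEIPEEYL"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Minimal codon table: only the codons that occur in HIRK1_A.
CODONS = {"AGC": "S", "AAG": "K", "CGC": "R", "CCC": "P", "GGC": "G",
          "AAC": "N", "GAC": "D", "TTC": "F", "GAG": "E", "ATC": "I"}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, bottom, n=4):
    """Return (top overhang, bottom overhang) if the strands pair
    everywhere except for n unpaired bases at each 5' end, else None."""
    if top[n:] != reverse_complement(bottom[n:]):
        return None
    return top[:n], bottom[:n]


def translate(dna):
    """Translate complete codons only, using the minimal table above."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


overhangs = five_prime_overhangs(HIRK1_A, HIRK1_B)
# Assumed frame: the G of the NheI site (G^CTAGC) precedes the oligonucleotide,
# so codons start at position 2 of the top strand (AGC AAG CGC ...).
peptide = translate(HIRK1_A[2:])

if __name__ == "__main__":
    print(overhangs)
    print(peptide)
```

The same functions could be applied to the other oligonucleotide pairs, provided the codon table is extended to cover their codons.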

Oligonucleotide pair 2 encodes an NheI compatible overhang, lysine and arginine, the first 13
amino acids of R3-hirulog, and has an XmaI compatible overhang at the 3' end.

HirKR1-A: 5'-CTAGCAAGCGCCGCCCCGGCGGCGGCGGCAACGGCGACTTCGAGGAGATC
HirKR1-B: 5'-CCGGGATCTCCTCGAAGTCGCCGTTGCCGCCGCCGCCGGGGCGGCGCTTG

Oligonucleotide pair 3 encodes the last 5 amino acids of R3-hirulog, a lysine residue, and the first
13 amino acids of R3-hirulog. It has BspEI or XmaI compatible overhangs at both ends.

Hir2-A: 5'-CCGGAGGAGTACCTGAAGCGCCCCGGCGGCGGCGGCAACGGCGACTTCGAGGAGATC
Hir2-B: 5'-CCGGGATCTCCTCGAAGTCGCCGTTGCCGCCGCCGCCGGGGCGCTTCAGGTACTCCT

Oligonucleotide pair 4 encodes the last 5 amino acids of R3-hirulog, a stop codon, a BstEII
recognition site and has an XmaI compatible overhang at the 5' end and an EcoRI compatible
overhang at the 3' end.

Hir3-A: 5'-CCGGAGGAGTACCTGTAGGTGACCG
Hir3-B: 5'-AATTCGGTCACCTACAGGTACTCCT

Oligonucleotide pair 5 encodes the last 5 amino acids of R3-hirulog, a lysine residue, five
histidine residues, a stop codon, a BstEII recognition site and has an XmaI compatible overhang
at the 5' end and an EcoRI compatible overhang at the 3' end.

Hir3-5H-A: 5'-CCGGAGGAGTACCTGAAGCACCACCACCACCACTAGGTGACCG
Hir3-5H-B: 5'-AATTCGGTCACCTAGTGGTGGTGGTGGTGCTTCAGGTACTCCT

Each oligonucleotide pair was annealed and the DNA ends phosphorylated using polynucleotide
kinase. The resulting double-stranded DNA molecules were ligated with pSL1180 which had
been digested with appropriate restriction endonucleases (see below) and dephosphorylated using
alkaline phosphatase.

Oligonucleotide pair 1 was ligated with pSL1180 which had been cut with NheI and XmaI to
give plasmid pSLHirK1.

Oligonucleotide pair 2 was ligated with pSL1180 which had been cut with NheI and XmaI to
give plasmid pSLHirKR1.

Oligonucleotide pair 3 was ligated with pSL1180 which had been cut with BspEI and XmaI to
give plasmid pSLHirK2.

Oligonucleotide pair 4 was ligated with pSL1180 which had been cut with BspEI and EcoRI to
give plasmid pSLHir3.

Oligonucleotide pair 5 was ligated with pSL1180 which had been cut with BspEI and EcoRI to
give plasmid pSLHir3-5H.

Fig. 2 shows diagrams of the above five plasmids containing synthetic DNA and shows the
important restriction sites and the amino acids encoded by the oligonucleotides. In each case
DNA sequence analysis confirmed that the sequence of the oligonucleotides was as expected and,
in the case of pSLHirK2, that the oligonucleotides were inserted in the desired orientation
in the multiple cloning site of pSL1180.

In order to create a plasmid containing DNA encoding a complete R3-hirulog peptide preceded
by a lysine residue, pSLHirK1 was digested with XmaI and EcoRI, and the large DNA fragment was
isolated and dephosphorylated using alkaline phosphatase. This DNA was ligated with
oligonucleotide pair 4 which had been purified by agarose gel electrophoresis after digestion of
pSLHir3 with BspEI and EcoRI. The resulting plasmid was named pA3 (Fig. 3).

In order to create a plasmid containing DNA encoding a complete R3-hirulog peptide preceded
by a lysine residue and with a lysine and five histidine residues at the carboxyl terminus,
pSLHirK1 was digested with XmaI and EcoRI, and the large DNA fragment was isolated and
dephosphorylated using alkaline phosphatase. This DNA was ligated with oligonucleotide pair 5
which had been purified by agarose gel electrophoresis after digestion of pSLHir3-5H with
BspEI and EcoRI. The resulting plasmid was named pA2 (Fig. 3).

Creation of tandem repeats of the R3-hirulog encoding sequence was performed as follows.
Plasmid pSLHirK2 was digested with BspEI and DraIII, the DNA ends were dephosphorylated
with alkaline phosphatase and the larger DNA fragment was isolated from an agarose gel. The
same plasmid, pSLHirK2, was also cut with XmaI and DraIII and the smaller of the two DNA
fragments was purified from an agarose gel. These two DNA fragments were ligated to create
the plasmid pSLHirK2X2 (Fig. 4). This plasmid contained two tandem copies of oligonucleotide
pair 3 and thus encoded the last 5 residues of R3-hirulog, a lysine residue, a complete R3-hirulog
peptide, a lysine residue, and the first 13 residues of R3-hirulog.

Plasmid pSLHirK2X2 was digested with BspEI and DraIII, the DNA ends were
dephosphorylated with alkaline phosphatase and the larger DNA fragment was isolated from an
agarose gel. The same plasmid, pSLHirK2X2, was also cut with XmaI and DraIII and the
smaller of the two DNA fragments was purified from an agarose gel. These two DNA fragments
were ligated to create the plasmid pSLHirK2X4 (Fig. 4). This plasmid contained four tandem
copies of oligonucleotide pair 3 and thus encoded the last 5 residues of R3-hirulog, a lysine
residue, three complete R3-hirulog peptides separated by lysine residues, a lysine residue, and the
first 13 residues of R3-hirulog.
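The doubling strategy relies on BspEI (T^CCGGA) and XmaI (C^CCGGG) producing the same CCGG overhang, so that a BspEI end can be joined to an XmaI end. The Python sketch below models only the encoded peptide, not the DNA or the restriction digests, and checks that two rounds of doubling give the compositions stated above. The names are illustrative.

```python
R3_HIRULOG = "RPGGGGNGDFEEIPEEYL"
HEAD = R3_HIRULOG[:13]   # first 13 residues
TAIL = R3_HIRULOG[13:]   # last 5 residues

# Peptide encoded by one copy of oligonucleotide pair 3.
PAIR3 = TAIL + "K" + HEAD


def double(insert):
    """Model one round of head-to-tail joining of an insert with itself."""
    return insert + insert


x2 = double(PAIR3)   # insert of pSLHirK2X2
x4 = double(x2)      # insert of pSLHirK2X4

if __name__ == "__main__":
    print(x2)
    print(x4)
```

Each join brings the first 13 residues of one unit next to the last 5 residues of the following unit, which is how complete R3-hirulog copies arise at the junctions.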

In order to create a plasmid containing DNA encoding five copies of R3-hirulog preceded and
separated by lysine residues the following strategy was employed. Plasmid pSLHirK1 was
digested with XmaI and EcoRI, the DNA ends were dephosphorylated and the large DNA
fragment was purified after agarose gel electrophoresis. Plasmid pSLHirK2X4 was digested with
BspEI and EcoRI and the smaller DNA fragment was purified. These two fragments were ligated
to form pK1,K2X4 (Fig. 5). This plasmid, pK1,K2X4, was then digested with XmaI and EcoRI,
the DNA was dephosphorylated, and the large fragment was purified. Plasmid pSLHir3 was
digested with BspEI and EcoRI and the small DNA fragment was purified. Ligation of these two
fragments created pK1,K2X4,3 (Fig. 5).

In order to create a plasmid containing DNA encoding five copies of R3-hirulog separated by
lysine, preceded by lysine and arginine, and with a lysine and five histidine residues at the
carboxyl end the following strategy was employed. Plasmid pSLHirKR1 was digested with XmaI
and EcoRI, the DNA ends were dephosphorylated and the large DNA fragment was purified after
agarose gel electrophoresis. Plasmid pSLHirK2X4 was digested with BspEI and EcoRI and the
smaller DNA fragment was purified. These two fragments were ligated to form pKR1,K2X4
(Fig. 6). This plasmid, pKR1,K2X4, was then digested with XmaI and EcoRI, the DNA was
dephosphorylated, and the large fragment was purified. Plasmid pSLHir3-5H was digested with
BspEI and EcoRI and the small DNA fragment was purified. Ligation of these two fragments
created pKR1,K2X4,3H (Fig. 6).
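The two assembly schemes above can be checked at the peptide level with a short Python sketch. It concatenates the peptide segments encoded by each oligonucleotide pair, in the order in which the fragments are joined, and confirms the expected products. Only the encoded amino acids are modelled, and the variable names are hypothetical stand-ins for the plasmid inserts.

```python
R3_HIRULOG = "RPGGGGNGDFEEIPEEYL"
HEAD, TAIL = R3_HIRULOG[:13], R3_HIRULOG[13:]

# Peptide segments encoded by the oligonucleotide pairs.
PAIR1 = "K" + HEAD               # pSLHirK1
PAIR2 = "KR" + HEAD              # pSLHirKR1
PAIR3 = TAIL + "K" + HEAD        # pSLHirK2 (four copies in pSLHirK2X4)
PAIR4 = TAIL                     # pSLHir3 (followed by a stop codon)
PAIR5 = TAIL + "K" + "H" * 5     # pSLHir3-5H (followed by a stop codon)

insert_k5 = PAIR1 + PAIR3 * 4 + PAIR4     # pK1,K2X4,3
insert_kr5h = PAIR2 + PAIR3 * 4 + PAIR5   # pKR1,K2X4,3H

# Cleavage after the Lys-Arg pair would release everything downstream of it.
released_by_kex2 = insert_kr5h[len("KR"):]

if __name__ == "__main__":
    print(len(insert_k5), insert_k5)
    print(len(released_by_kex2), released_by_kex2)
```

The first insert comes to 95 residues (five Lys plus R3-hirulog units), and the fragment downstream of the Lys-Arg site in the second comes to 100 residues.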

In the above examples R3-hirulog-encoding sequences were assembled in a manner designed to
allow their in-frame fusion at the 3' end of the glucoamylase coding region and thus included a
translation stop codon near the 3' end of the synthetic DNA. It was also necessary to design
R3-hirulog encoding oligonucleotides which could be inserted at an internal position within the
glucoamylase coding region without disrupting the reading frame. For this purpose it was desired
that the oligonucleotides would encode R3-hirulog preceded, separated by and followed by
lysine residues as shown below where K=lysine, X=R3-hirulog.

K-X-K 

The following pair of oligonucleotides was synthesized.

Oligonucleotide pair 6 encodes the last 5 amino acids of R3-hirulog followed by a lysine residue,
has an XmaI compatible overhang at the 5' end and an NheI compatible overhang at the 3' end.

Hir3K-A: 5'-CCGGAGGAGTACCTGAAGGCTAGCG
Hir3K-B: 5'-AATTCGCTAGCCTTCAGGTACTCCT

In order to create a plasmid containing DNA encoding a complete R3-hirulog peptide preceded
and followed by a lysine residue, plasmid pSLHirK1 was digested with XmaI and EcoRI, the
DNA ends were dephosphorylated using alkaline phosphatase and the large DNA fragment was
purified from an agarose gel. This DNA fragment was ligated with annealed oligonucleotide pair
6, the 5' ends of which had been phosphorylated using polynucleotide kinase, to create plasmid
pK1,3K (Fig. 7).

Example 2 

Design of oligonucleotides encoding HA epitope 
Oligonucleotides were designed to encode a nine amino acid epitope of hemagglutinin (HA) with
the following sequence: YPYDVPDYA. The design of the oligonucleotides ultimately allowed
this amino acid sequence to be fused at the amino terminus of glucoamylase (followed by a
methionine residue) or to be inserted into the linker region of glucoamylase (preceded by a
lysine residue and followed by a methionine residue).

Two pairs of complementary oligonucleotides were synthesized with the following sequences. 

Oligonucleotide pair 7 encodes an NheI compatible overhang, a lysine residue, the 9 amino acids
of HA epitope, a methionine residue, and has an NheI compatible overhang at the 3' end.

K-EPI-M-A: 5'-CTAGCAAGTACCCCTACGATGTGCCCGACTACGCTATGG
K-EPI-M-B: 5'-CTAGCCATAGCGTAGTCGGGCACATCGTAGGGGTACTTG

Oligonucleotide pair 8 encodes a BssHII compatible overhang, alanine, lysine and arginine
residues, the 9 amino acids of HA epitope, a methionine residue, and has a BssHII compatible
overhang at the 3' end.

NEPI-A: 5'-CGCGCTAAGCGCTACCCCTACGATGTGCCCGACTACGCTATG
NEPI-B: 5'-CGCGCATAGCGTAGTCGGGCACATCGTAGGGGTAGCGCTTAG

Oligonucleotide pair 7 was ligated with pSL1180 which had been cut with NheI to give plasmid
pEpiN (Fig. 8).

Oligonucleotide pair 8 was ligated with pLitmus29 (New England Biolabs, Beverly, MA, USA)
which had been cut with BssHII to give plasmid pLit29+Nepi (Fig. 8).

Example 3 

Construction of the pGA+ plasmid series, a series of expression vectors containing 
polypeptide-encoding sequences fused to the linker region of glucoamylase 

A series of expression plasmids was constructed, using methods known in the art, designed to 
produce fusion proteins consisting of desired polypeptides fused to the C-terminus of the 
catalytic core and part of the linker region of Aspergillus glucoamylase. 

The pGA+ series of plasmids was constructed based on the bacterial plasmid pSL1180
(Pharmacia LKB). This plasmid is similar to the pUC series of phagemids such as pUC118 but
has an extended multiple cloning site (MCS) containing all possible six base pair palindromes
including two recognition sites for ClaI, one of which is methylated and the other not methylated
by E. coli dam methylase. An expression cassette was inserted into the MCS of pSL1180 so that
the first 28 bp of the MCS from HindIII to SpeI was present at one end. Beginning at the SpeI
recognition site the expression cassette contains the following components: the promoter and 5'
flanking region of the glaA gene of A. niger var. awamori strain UVK143f (US patent
5,364,770) starting at a SpeI site approximately 1 kb upstream from the translation start codon;
the coding region of the glaA gene from strain UVK143f from the start codon to an NheI site,
the six bp of which encode amino acids 497 and 498 in the linker region of mature glucoamylase;
a synthetic piece of double stranded DNA encoding the desired peptide with NheI and BstEII
ends; approximately 250 bp of the terminator region of the glaA gene from A. niger strain #7
(US patent 5,364,770) beginning at a BstEII site 20 bp 5' of the translation stop codon and
ending at an EcoRV site; an approximately 2.4 kb fragment containing the A. niger pyrG gene
(Wilson et al., 1988) obtained as a SmaI to ScaI fragment from pBH2 (Ward et al., 1989) (the
SmaI site exists within the pUC18 MCS of pBH2 and the ScaI site exists 65 bp from the end of
the BamHI-HindIII fragment of A. niger DNA present in pBH2); approximately 1.3 kb of 3'
flanking DNA from the glaA gene of A. niger, beginning at the EcoRV site 250 bp 3' of the
translation stop codon and ending at a ClaI site. After the ClaI site there exists 75 bp of the
pSL1180 MCS from ClaI (the site not methylated by E. coli dam methylase) to HindIII,
followed by 6 bp of pBR322 from HindIII to ClaI, followed by 198 bp of the pSL1180 MCS
from ClaI (the site methylated by E. coli dam methylase) to EcoRI.

As shown in Fig. 9 these expression vectors (pGA+ series) contain unique NheI and BstEII sites
which can be utilized to insert a DNA fragment encoding any desired peptide such that a fusion
protein (comprising the glucoamylase catalytic core, part of the glucoamylase linker region and
the desired peptide) would be expected to be produced after transformation into Aspergillus
strains. The pyrG gene is included as a selectable marker for transformation into Aspergillus
strains which are pyrG null mutants.

The NheI - BstEII pieces of synthetic DNA encoding R3-hirulog peptide or tandem repeats
thereof were isolated from plasmids pA2; pA3; pK1,K2X4,3; and pKR1,K2X4,3H and were
inserted into the pGA+ series plasmid at the unique NheI and BstEII sites to form pGAKHiH+,
pGAKHi+, pGA(KHi)5+, and pGAKR(KHi)5H+ respectively (Fig. 9).

For control experiments an expression plasmid designed to produce in Aspergillus a truncated
version of glucoamylase (residues 1-498 of mature glucoamylase with a lysine residue at the C-
terminus but with no attached polypeptide) was constructed by the following steps. A pair of
oligonucleotides with the following sequences was synthesized.

GAA3'-A: 5'-CTAGCAAGTAG
GAA3'-B: 5'-GTCACCTACTTG

The above oligonucleotide pair was annealed, the DNA ends were phosphorylated with
polynucleotide kinase and it was inserted into the pGA+ series plasmid at the unique NheI and
BstEII sites to form pGA_3'.

Plasmids pGAKHiH+, pGAKHi+, pGA(KHi)5+, pGAKR(KHi)5H+ and pGA_3' were
transformed into strains GCGAP3-4 or dgr246 P2 and the transformants were screened for
glucoamylase production as described in the General Methods section. The following
transformants were selected for further study.

GAP3-4GAKHiH+ #9
GAP3-4GA(KHi)5+ #10
GAP3-4GAKR(KHi)5H+ #7
dgr246GAKHiH+ #3
dgr246GA(KHi)5+ #11
dgr246GAKR(KHi)5H+ #14
GAP3-4GA_3' #2

With appropriate transformants, it was possible to confirm that a single copy or five tandem 
repeats of R3-hirulog remained attached to glucoamylase in supernatant samples by 
demonstration of the presence of a histidine tail at the carboxyl terminus or by determination of 
the size of the glucoamylase-hirulog fusion protein produced. The results are summarized below. 
Two forms of glucoamylase, GAI and GAII, are naturally present in the supernatant of
Aspergillus niger cultures. On SDS-PAGE these forms have apparent molecular weights of
approximately 69 kDa and 48 kDa. The truncated form of glucoamylase
produced by transformant GAP3-4GA_3' #2 had an apparent molecular weight of 47 kDa, as
estimated by gel electrophoresis.

Transformants GAP3-4GAKHiH+ #9 and dgr246GAKHiH+ #3 were cultured in Sheftone N
medium for 2 to 3 days. A single additional protein band, compared to untransformed control 
strains, was observed following electrophoretic separation of the secreted proteins. This protein 
was of a similar size (approximately 48 kDa) as the glucoamylase protein produced by 
transformant GAP3-4GA_3' #2. It was possible to purify some of the 48 kDa protein produced 
by transformants GAP3-4GAKHiH+ #9 and dgr246GAKHiH+ #3 using the following approach. 
Supernatant samples were desalted using PD-10 Columns (Pharmacia Biotech, Piscataway, NJ, 
USA). A buffer comprising 50 mM sodium phosphate, pH 8.0, 300 mM NaCl, 20 mM imidazole,
1 mM PMSF was used to equilibrate these columns and to elute the protein samples. Qiagen Ni-
NTA spin columns (Qiagen, Chatsworth, CA, USA) were then used according to the
manufacturer's instructions to purify any proteins with a polyhistidine tail. Protein samples
purified in this manner were concentrated using Microcon 3 concentrators (Amicon). A single 48
kDa protein was purified from culture supernatants of both of these transformants. However, it
was not possible to purify a protein of any size from GAP3-4GA_3' #2 using the Ni-NTA spin
columns. These experiments demonstrated that the desired glucoamylase-hirulog fusion protein 
(i.e., glucoamylase catalytic core and linker region with one copy of R3-hirulog and five histidine 
residues at the carboxyl terminus) is produced by transformants GAP3-4GAKHiH+ #9 and 
dgr246GAKHiH+ #3.

Transformants GAP3-4GAKR(KHi)5H+ #7 and dgr246GAKR(KHi)5H+ #14 were cultured in
Sheftone N medium for 2 to 3 days. The expected glucoamylase-hirulog product of these 
transformants consisted of glucoamylase catalytic core and linker region, followed by a KEX2 
cleavage site (Lys-Arg), five tandem repeats of R3-hirulog (each separated by a lysine residue), 
and ending with a lysine and five histidine residues at the carboxyl terminus. Cleavage by KEX2 
within the secretory apparatus of Aspergillus was expected to cleave this fusion protein to 
release a truncated form of glucoamylase (i.e., the catalytic core and linker region; being the 
same size as the glucoamylase produced by GAP3-4GA_3' #2) and a 100 amino acid fragment
containing the five tandem repeats of R3-hirulog and the terminal five histidine residues. Three 
extra protein bands with apparent molecular weights of approximately 58 kDa, 44 kDa, and 10.8 
kDa which were not observed in samples from untransformed control strains, were observed in 
supernatant samples from GAP3-4GAKR(KHi)5H+ #7 and dgr246GAKR(KHi)5H+ #14 after
SDS-PAGE and Coomassie staining. We concluded that only a proportion of the glucoamylase-
(hirulog)5 fusion protein was cleaved by KEX2 so that a full-length form of glucoamylase-
(hirulog)5, the expected truncated form of glucoamylase and a polypeptide containing five repeats
of R3-hirulog were present. Purification of proteins bearing a polyhistidine tail from culture 
supernatant of these transformants was performed using Ni-NTA columns. Two purified proteins 
could be eluted from these columns; one of approximately 58 kDa and thus representing the full- 
length glucoamylase-(hirulog)5 fusion, and one of approximately 10.8 kDa which was in good
agreement with the predicted size of the 100 amino acid polypeptide containing five tandem 
repeats of R3-hirulog. 

The smaller polypeptide (10.8 kDa) was purified by HPLC from the mixture of two proteins 
purified above using Ni-NTA columns from culture supernatant of dgr246GAKR(KHi)5H+ #14.
The amino terminal sequence was determined to be GGGGN; i.e., it began with the third residue 
of R3-hirulog instead of the first residue as expected. Mass spectrometry was used to accurately 
determine the molecular weight of this polypeptide. A size which was consistent with the 
expected sequence of 100 amino acids minus the first two residues was determined. Thus, it was 
concluded that the expected 100 amino acid tandem repeat of R3-hirulog was released by the 
action of KEX2 proteinase from the glucoamylase-(hirulog)5 protein, and that this polypeptide
presumably lost the first two amino acid residues as a result of aminopeptidase action. 

Trypsin was used to digest separately the 10.8 kDa and 58 kDa proteins which had been purified 
on Ni-NTA columns from culture supernatant of strain dgr246GAKR(KHi)5H+ #14. The peptide
fragments which were generated were separated by HPLC. A peptide which was generated from
both the 10.8 kDa and 58 kDa proteins was identified and collected. Mass spectrometry
analysis determined a molecular weight for this peptide which was consistent with R3-hirulog. 
The amino acid sequence of the amino-terminus of this peptide was determined as RPGGGN, 
confirming that it was R3-hirulog. 

The glucoamylase-hirulog fusion proteins produced either by transformant dgr246GA(KHi)5+
#11 after 3 days or by GAP3-4GA(KHi)5+ after 2 days in Sheftone N medium were identified on
Coomassie stained SDS-polyacrylamide gels by comparison with supernatant samples from
untransformed control strains. The glucoamylase-(hirulog)5 protein from these transformants was
estimated to be 55 kDa, approximately 8 kDa larger than the truncated glucoamylase produced
by transformant GAP3-4GA_3' #2 (47 kDa). This difference in size corresponds approximately
to the expected size (95 amino acids, 10.2 kDa) of five repeats of R3-hirulog which were
expected to be attached to the glucoamylase protein at the same position (amino acid 498) at
which the glucoamylase product is truncated in transformant GAP3-4GA_3' #2.
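The expected sizes quoted in this example can be reproduced with a simple mass calculation. The Python sketch below sums standard average residue masses and adds one water per chain. It is an illustrative estimate only, not the calculation used by the inventors, and it covers only the amino acids that occur in these constructs.

```python
# Average residue masses (Da) for the amino acids present in the constructs.
RESIDUE_MASS = {
    "G": 57.0519, "P": 97.1167, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "K": 128.1741, "E": 129.1155,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760,
}
WATER = 18.0153

R3_HIRULOG = "RPGGGGNGDFEEIPEEYL"


def average_mass(peptide):
    """Average molecular mass (Da) of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


five_repeats = ("K" + R3_HIRULOG) * 5                             # 95 residues
his_tagged = R3_HIRULOG + ("K" + R3_HIRULOG) * 4 + "K" + "HHHHH"  # 100 residues
trimmed = his_tagged[2:]  # after loss of the first two residues

if __name__ == "__main__":
    for name, seq in [("five repeats", five_repeats),
                      ("100-residue fragment", his_tagged),
                      ("trimmed fragment", trimmed)]:
        print(f"{name}: {len(seq)} aa, {average_mass(seq) / 1000:.1f} kDa")
```

For the 95-residue sequence this gives about 10.2 kDa, matching the figure quoted above. The trimmed sequence begins GGGGN, as found by amino terminal sequencing.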

Example 4 

Construction of the pG plasmid series, a series of expression vectors containing 
polypeptide-encoding sequences fused to the catalytic region of glucoamylase 

A series of expression plasmids (the pG series) was constructed, using methods known in the art, 
designed to produce fusion proteins consisting of desired polypeptides fused to the C-terminus 
of the catalytic region of Aspergillus glucoamylase. The expression cassette in this plasmid series 
contained the promoter and 5' flanking region of the glaA gene; the coding region of the glaA
gene from the start codon to an engineered NheI site immediately after codon number 468; a synthetic
piece of double stranded DNA encoding the desired peptide with NheI and BstEII ends; the
terminator region of glaA; the A. niger pyrG gene; and the 3' flanking DNA from the glaA gene.

To construct the pG series of plasmids it was first necessary to introduce an NheI site at the
desired position within the glucoamylase coding region. To do this we made use of the fact that
there is a naturally occurring DraIII restriction site which interrupts the glucoamylase coding
region at the codon for amino acid 465 of mature glucoamylase. A plasmid derived from
pGAKHi+ was cut with DraIII (see Fig. 9), the DNA ends were dephosphorylated by the action
of alkaline phosphatase, and the larger of the two fragments containing the glaA promoter and
most of the glaA coding region was purified. In a separate reaction the same plasmid was cut
with DraIII plus NheI and the second to largest DNA fragment containing the glaA terminator
and pyrG was purified. These two purified DNA fragments were ligated along with the synthetic
piece of double stranded DNA obtained by annealing the following complementary
oligonucleotides which had been phosphorylated at their 5' ends. The sequences of the two
oligonucleotides were 5'-GTGGCCGAGTG-3' and 5'-CTAGCACTCGGCCACGAG-3'.
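The end structure of this adapter can be checked with a short Python sketch. DraIII cuts within CACNNN^GTG and leaves a three-nucleotide 3' overhang, whereas NheI leaves a four-nucleotide 5' overhang, so the annealed pair would be expected to present one end of each type. The helper names are hypothetical, and the sketch assumes the shorter strand pairs entirely within the longer one.

```python
TOP = "GTGGCCGAGTG"            # first oligonucleotide, 5' to 3'
BOTTOM = "CTAGCACTCGGCCACGAG"  # second oligonucleotide, 5' to 3'

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def bottom_strand_overhangs(top, bottom):
    """Return (3' overhang, 5' overhang) of the longer bottom strand,
    or None if the top strand does not pair within it."""
    bottom_rc = reverse_complement(bottom)  # bottom read in top-strand orientation
    offset = bottom_rc.find(top)
    if offset < 0:
        return None
    unpaired_3p = offset                                # bases past the top strand's 5' end
    unpaired_5p = len(bottom_rc) - offset - len(top)    # bases past the top strand's 3' end
    return bottom[len(bottom) - unpaired_3p:], bottom[:unpaired_5p]


ends = bottom_strand_overhangs(TOP, BOTTOM)

if __name__ == "__main__":
    print(ends)
```

The result is a three-nucleotide 3' overhang at one end and a CTAG 5' overhang at the other, consistent with joining a DraIII-cut end to an NheI-cut end.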

The above ligation created the plasmid pGHi (Fig. 10) which encodes a fusion protein consisting
of the prepro regions of glucoamylase, mature glucoamylase residues 1-468, and R3-hirulog.
Plasmid pGHi contains unique NheI and BstEII sites which can be utilized to insert a DNA
fragment encoding any other desired polypeptide such that a fusion protein (comprising the
glucoamylase catalytic region and the desired peptide) would be expected to be produced after
transformation into Aspergillus strains.

Plasmid pGHi was digested with NheI and BstEII, the DNA ends were dephosphorylated with
alkaline phosphatase and the large DNA fragment was purified. Plasmid pK1,K2X4,3 was cut
with NheI and BstEII and the small fragment of DNA encoding five tandem repeats of R3-
hirulog was purified. These two fragments of DNA were ligated to form pG(KHi)5 (Fig. 10).

Plasmids pGHi and pG(KHi)5 were transformed into strains GCGAP3-4 or dgr246 P2 and the
transformants were screened for glucoamylase production as described in the General Methods
section. The following transformants were selected for further study.

GAP3-4G(Hi)5 #3
dgr246GHi #5
dgr246G(Hi)5 #7

The glucoamylase-hirulog fusion protein produced by transformants dgr246GHi #5 and
dgr246G(Hi)5 #7 after 3 days, and by GAP3-4G(Hi)5 #3 after 2 days, in Sheftone N medium was
identified on Coomassie stained polyacrylamide gels by comparison with supernatant samples
from untransformed control strains. As judged by electrophoresis, the protein from dgr246G(Hi)5
#7 and GAP3-4G(Hi)5 #3 was approximately 59 kDa, which was 8 kDa larger than that from
dgr246GHi #5 at approximately 42 kDa. This difference in size corresponds to approximately the
expected size (76 amino acids, 8.2 kDa) of four additional repeats of R3-hirulog attached to the
glucoamylase-(hirulog)5 protein.

Example 5

Construction of the pGAXS plasmid series, a series of expression vectors containing 
polypeptide-encoding sequences inserted into the linker region of glucoamylase 

A series of expression plasmids (the pGAXS series) was constructed, using methods known in 
the art, designed to produce fusion proteins consisting of desired polypeptides inserted into the 
linker region of full-length Aspergillus glucoamylase. The expression cassette in this plasmid 
series contained the promoter and 5' flanking region of the glaA gene; the complete coding 
region of the Aspergillus glaA gene with a synthetic piece of double stranded DNA (with NheI
sites at both ends) encoding the desired peptide inserted into the naturally occurring NheI
restriction site at codons 497 and 498; the terminator region of glaA; the A. niger pyrG gene;
and the 3' flanking DNA from the glaA gene. 

The construction of this series of plasmids proceeded as follows. An approximately 340 bp
fragment of DNA was isolated which represents the 3' end of the glaA coding region from an
NheI site at codons 497 and 498 of mature glucoamylase to a BstEII site 20 bp 5' of the
translation stop codon which is present in the A. niger strain #7 glaA. Plasmid pGAKHH- was
digested with NheI and BstEII, the DNA ends were dephosphorylated, and the larger of the two
fragments generated was purified. The above two DNA fragments were ligated to produce
plasmid pGApyrGV which contains the complete glaA coding region with a unique NheI site
within the portion encoding the glucoamylase linker region.

Plasmid pGApyrGV was digested with NheI and the DNA ends were dephosphorylated. Plasmid
pKl,3K was cut with NheI and the small fragment of DNA encoding R3-hirulog was purified.
These two fragments of DNA were ligated to form pGAHiS (Fig. 11). DNA sequence analysis
confirmed that the peptide-encoding DNA fragment was inserted in the desired orientation.

Similarly, the NheI fragment of synthetic DNA encoding the HA epitope was isolated from
plasmid pEpiN and inserted into the NheI site of pGApyrGV to create pGAepiS (Fig. 11). DNA
sequence analysis confirmed that the peptide-encoding DNA fragment was inserted in the desired
orientation.

Plasmids pGAHiS and pGAepiS were transformed into strain GCGAP3-4 and the transformants
were screened for glucoamylase production as described in the General Methods section. The 
following transformants were selected for further study. 

GAP3-4GAHiS #3
GAP3-4GAepiS #1 

The glucoamylase-hirulog fusion protein produced by GAP3-4GAHiS #3 grown in Sheftone N 
medium was identified on Coomassie stained polyacrylamide gels by comparison with 
supernatant samples from the untransformed control strain. This glucoamylase-hirulog product
appeared to be similar to the native A. niger glucoamylase product in that two protein bands were
observed which were estimated to be 69 and 54 kDa in size. In this example, a single copy of
R3-hirulog has been inserted into the linker of full-length glucoamylase. Since a protein the size of
full-length glucoamylase was produced, it is clear that the embedded hirulog was also present in 
the fusion protein. 

The glucoamylase-HA epitope produced by GAP3-4GAepiS #1 grown in Clofine medium for 3
days was detected on Western blots. As with the product of GAP3-4GAHiS #3 above, the
glucoamylase-HA epitope product appeared similar to the native glucoamylase product in that
forms of the protein of a similar size to GAI and GAII were observed on Coomassie stained
SDS-polyacrylamide gels. Both the larger and the smaller form of glucoamylase-HA epitope
were detectable using the anti-HA antibody.

Example 6 

Construction of pEpiGA, an expression vector containing a polypeptide-encoding
sequence fused to the amino terminus of glucoamylase

An expression plasmid (pEpiGA) was designed to produce a fusion protein consisting of the HA 
epitope fused to the N-terminus of residues 1-498 of mature Aspergillus glucoamylase. The 
expression cassette in this plasmid contained the following components: the promoter and 5'
flanking region of the glaA gene; the coding region for the prepro regions of glucoamylase; a
synthetic piece of double stranded DNA inserted at a BssHII site encoding alanine, lysine,
arginine, the HA epitope peptide, and a methionine residue; the coding region for residues 1-498
of mature glucoamylase; the terminator region of the glaA gene; the A. niger pyrG gene; and the
3' flanking DNA from the glaA gene.

The construction of this plasmid proceeded as follows. A plasmid derived from the pGA+ series
was digested with HindIII and SphI and a DNA fragment of approximately 2.2 kb was purified
from an agarose gel. This fragment contains the glaA promoter and the first approximately 1 kb
of the coding region of glaA. Within this fragment the naturally occurring BssHII cleavage site at
the end of the coding region for the glucoamylase proregion is unique. Plasmid pLitmus38 (New
England Biolabs) was digested with HindIII and SphI, the DNA ends were dephosphorylated
with alkaline phosphatase, and the large DNA fragment was purified from an agarose gel. These
two DNA fragments were ligated to form pLitGA5'.

Plasmid pLitGA5' was digested with BssHII, was dephosphorylated with alkaline phosphatase
and the DNA fragment was purified. Plasmid pLit29+Nepi was digested with BssHII and the 42
residue oligonucleotide was purified from an agarose gel. These two DNA fragments were
ligated to form pGA5'+Nepi. DNA sequence analysis confirmed that a single oligonucleotide had 
been inserted into the glaA coding region in the desired orientation. 

Finally, pGA5'+Nepi was digested with HindIII and SphI and the DNA fragment containing the
glaA promoter and first 1 kb of coding region with inserted oligonucleotide was purified from an
agarose gel. Plasmid pGAA3' was digested with HindIII and SphI and the largest resulting DNA
fragment (containing the rest of the glaA coding region, the glaA terminator region and the
inserted pyrG gene) was purified. Plasmid pGAA3' was digested with HindIII, the DNA ends
were dephosphorylated, and the smaller of the two resulting DNA fragments (comprising the
pSL1180 vector) was purified. These three DNA fragments were ligated to form pEpiGA (Fig.
12).

The amino acid sequence of the glucoamylase produced by Aspergillus transformants harboring 
this expression vector would be expected to consist of the authentic signal sequence (which 
would be removed upon entry into the endoplasmic reticulum of the cell), the authentic
prosequence (which would be removed within the secretory apparatus of the cell by
KEX2-mediated cleavage), alanine, lysine and arginine residues (which should also be removed within
the secretory apparatus of the cell by KEX2), HA epitope followed by a methionine residue
(which could be cleaved on the C-terminal side by cyanogen bromide treatment to release the
epitope with attached methionine) and residues 1-498 of mature glucoamylase.

Plasmid pEpiGA was transformed into strain dgr246 P2 and the transformants were screened for
glucoamylase production as described in the General Methods section. The following
transformant was selected for further study.



dgr246 EpiGA #12 

Transformant dgr246 EpiGA #12 was cultured in Sheftone N medium for 3 days. Western 
analysis using the anti-HA antibody demonstrated the presence of the HA epitope-glucoamylase 
fusion protein. 

Example 7 

Construction of vectors for expression of CBHI-hirulog fusion proteins in Trichoderma
longibrachiatum

T. longibrachiatum cellobiohydrolase I, CBHI, is a secreted protein which has a 17 amino acid 
secretion signal sequence and a mature protein of 496 amino acids. The mature protein is 
composed of two separately folded domains, the catalytic core comprising residues 1-437 and 
the cellulose binding domain (CBD) comprising residues 461-496. These two domains are 
separated by a linker region, residues 434-460. 

The cbh1 gene (encoding CBHI) has been cloned previously (Shoemaker et al., 1983,
Bio/Technology 1:691-696). The entire coding region for CBHI was subcloned from genomic
DNA as an approximately 1.75 kb SstII-XmaI fragment into SstII-XmaI digested pSL1180. The
SstII site exists 15 bp 5' of the translation start codon and the XmaI site exists 76 bp 3' of the
translation stop codon. The resulting subclone was designated as pEN601.

Vector pEN601 was modified by replacing the CBHI cellulose binding domain (CBD) encoding
region with the restriction enzyme sites for NheI and BstEII to allow subsequent insertion of the
peptide encoding sequences. In addition, a naturally occurring BstEII site in the DNA encoding
the C-terminal portion of the catalytic core region of CBHI was inactivated. All these alterations
were done in a single step. The vector pEN601 was digested with BstEII, which cleaved the
vector within the cbh1 sequence. This BstEII-cut DNA was treated with mung bean nuclease to
remove the single-stranded DNA ends, resulting in blunt ends and destruction of the BstEII site.
This linear, blunt-ended pEN601 was further digested with AgeI, which cuts within the cbh1
transcription terminator at a position 23 bp 3' to the stop codon, resulting in removal of the
sequences coding for the CBD.

A DNA fragment was made by PCR amplification of cbh1 template DNA with primers pep1 and
pep2. The 37-mer pep1 primer coded for a PmlI site (after cleavage with PmlI the blunt-ended
DNA produced was designed to be ligated with the blunted BstEII site mentioned above)
followed by codons for the approximately 8 amino acids which naturally exist 3' to the BstEII
site in the DNA encoding CBHI. The codon usage for the first two amino acids after the BstEII
site, valine and threonine (residues 416 and 417 of mature CBHI), was designed to be altered
from GTC to GTG and ACC to ACT, respectively, in order to not recreate the BstEII
recognition site. These changes would not have altered the amino acid sequence in this part of
CBHI. However, after PCR it was observed by DNA sequence analysis that the first codon after
the BstEII site had been changed from GTC to GCG, changing amino acid number 416 of mature
CBHI from valine to alanine. The 53-mer pep2 primer had 23 nt at the 3' end which were
complementary to the end of the cbh1 linker region (up to codon number 459 of mature CBHI)
with NheI and BstEII sites for subcloning of polypeptide encoding sequences at the end of the
cbh1 linker region. The 5' end of pep2 had an AgeI cleavage site for insertion of the
PCR-generated fragment (170 bp) into pEN601. The primer sequences were as follows.

pep1: 5'-CTT CAC ACG TGA CTT TCT CCA ACA TCA AGT TCG GAC C

pep2: 5'-AAT CTA CCG GTG GTC ACC GCG CTT GCT AGC TCC GGG AGA GCT TCC
AGTGGTAG
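
The primer design described above can be checked mechanically. The following Python sketch is an illustration only. It scans pep1 and pep2 for the recognition sequences of PmlI (CACGTG), AgeI (ACCGGT), BstEII (GGTNACC) and NheI (GCTAGC), and tests whether the designed codon changes keep valine and threonine while removing the BstEII site. The helper names and the partial codon table are hypothetical. The single G placed upstream of the two codons is an assumption derived from the BstEII recognition sequence, and the use of ACT as the second codon in the observed sequence is also an assumption, since the text reports only the change in the first codon.

```python
import re

# Recognition sequences (N = any base).
SITES = {"PmlI": "CACGTG", "AgeI": "ACCGGT", "BstEII": "GGTNACC", "NheI": "GCTAGC"}

# Primer sequences as given in the text, with spaces removed.
PEP1 = "CTTCACACGTGACTTTCTCCAACATCAAGTTCGGACC"
PEP2 = "AATCTACCGGTGGTCACCGCGCTTGCTAGCTCCGGGAGAGCTTCCAGTGGTAG"

# Only the codons needed for this check (standard genetic code).
CODONS = {"GTC": "Val", "GTG": "Val", "GCG": "Ala", "ACC": "Thr", "ACT": "Thr"}


def find_sites(seq, site):
    """Return the 0-based start positions of a recognition sequence in seq."""
    pattern = "".join("[ACGT]" if base == "N" else base for base in site)
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]


def junction_has_bsteii(val_codon, thr_codon):
    """Check the junction G + first codon + second codon for a BstEII site.

    The leading G is an assumption taken from the GGTNACC recognition sequence.
    """
    return bool(find_sites("G" + val_codon + thr_codon, SITES["BstEII"]))


for name, primer in (("pep1", PEP1), ("pep2", PEP2)):
    hits = {enzyme: find_sites(primer, site) for enzyme, site in SITES.items()}
    print(name, len(primer), "nt", hits)

for label, first, second in (
    ("native", "GTC", "ACC"),
    ("designed", "GTG", "ACT"),
    ("observed", "GCG", "ACT"),  # second codon assumed to be as designed
):
    print(label, CODONS[first], CODONS[second],
          "BstEII site present:", junction_has_bsteii(first, second))
```

The lengths reported by the sketch can be compared with the 37-mer and 53-mer lengths stated above, and the positions of the sites in pep2 show the order AgeI, BstEII, NheI from the 5' end of the primer.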

The PCR amplified 170 bp insert was digested with PmlI and AgeI and cloned into BstEII and
AgeI digested pEN601, of which the BstEII site was blunt-ended as described above. The
resultant vector, carrying the modified cbh1 sequence, was designated as pEN602.

R3-hirulog encoding sequences were excised from pKl,K2x4,3 and pA3 (see Example 1) by
digestion with NheI and BstEII and were ligated into NheI/BstEII cut pEN602. The resultant
vector, with a coding sequence for CBHI (catalytic core and linker regions)-five repeats of
R3-hirulog was designated as pEN603 (5.1 kb) and the vector for the shorter fusion protein, i.e.,
with a single R3-hirulog unit, instead of five, was designated as pEN604 (4.9 kb) (Fig. 3). In
each case, the DNA encoded CBHI up to residue 459, followed by alanine and serine residues. 
The R3-hirulog peptides were separated from CBHI and from each other by lysine residues. 
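
The arrangement of the encoded fusion proteins can be written out as an ordered list of segments. The following Python sketch is illustrative only. The function name and segment labels are hypothetical, no actual peptide sequences are used, and the placement of the first lysine directly after the alanine and serine residues is an assumption about how the description above should be read.

```python
def cbhi_hirulog_layout(n_units):
    """Return the ordered segments of a CBHI-(R3-hirulog)n fusion.

    CBHI residues 1-459 are followed by alanine and serine. Each R3-hirulog
    unit is preceded by one lysine, so that lysine separates the units from
    CBHI and from each other (placement assumed, see lead-in).
    """
    if n_units < 1:
        raise ValueError("at least one R3-hirulog unit is required")
    segments = ["CBHI(1-459)", "Ala", "Ser"]
    for _ in range(n_units):
        segments += ["Lys", "R3-hirulog"]
    return segments


# pEN603 encodes five units and pEN604 a single unit (from the text).
for plasmid, units in (("pEN603", 5), ("pEN604", 1)):
    print(plasmid, "-".join(cbhi_hirulog_layout(units)))
```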

A Trichoderma expression vector, pTIC, was assembled by methods known in the art so that the
following components were present as an expression cassette which could be excised from the
plasmid by EcoRI digestion (Fig. 14). The expression cassette was located between the EcoRI
sites of pSP73 (Promega). The components of the expression cassette were the cbh1 5' flanking
region and promoter sequences from an EcoRI site approximately 2 kb 5' to the translation start
codon to the SstII site 15 bp 5' to the translation start codon; a synthetic piece of DNA encoding
an end compatible with SstII digested DNA, a PmeI restriction site and an end compatible with
XmaI digested DNA; the cbh1 3' flanking region from the XmaI site 76 bp 3' of the translation
stop codon to a BglII site approximately 1.25 kb 3' to the translation stop codon; a synthetic
piece of DNA with one end compatible with BglII digested DNA and the other end compatible
with NheI digested DNA; the T. longibrachiatum pyr4 gene on a 1.7 kb NheI to SphI DNA
fragment (Smith et al., 1991, Curr. Genet. 19:27-33); a synthetic piece of DNA with one end
compatible with SphI digested DNA and the other end compatible with BglII digested DNA; the
cbh1 3' flanking region from the BglII site approximately 1.25 kb 3' to the cbh1 translation stop
codon to a PstI site approximately 2.6 kb 3' to the translation stop codon; and part of the multiple
cloning site of pSL1180 from the PstI site to the EcoRI site. An SstII site naturally occurs within
the coding region of the pyr4 gene at around codon number 251. This site had been altered (codon
251 changed from CGC to CGA) by site directed mutagenesis so that it was no longer
recognized by SstII but without changing the encoded amino acid sequence.
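
The effect of the codon 251 change can be illustrated with a short check. The Python sketch below is a minimal illustration, assuming that codon 251 occupies the only position at which CGC occurs within the SstII recognition sequence CCGCGG. The function name is hypothetical, and the flanking bases are inferred from the recognition sequence rather than taken from the pyr4 sequence itself.

```python
SSTII_SITE = "CCGCGG"  # SstII recognition sequence

# Arginine codons in the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def mutate_codon_in_site(site, old_codon, new_codon):
    """Replace the first occurrence of old_codon within a recognition site."""
    start = site.index(old_codon)  # raises ValueError if the codon is absent
    return site[:start] + new_codon + site[start + len(old_codon):]


mutated = mutate_codon_in_site(SSTII_SITE, "CGC", "CGA")
print(SSTII_SITE, "->", mutated)
print("still an SstII site:", mutated == SSTII_SITE)
print("both codons encode arginine:", {"CGC", "CGA"} <= ARG_CODONS)
```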

pTIC was digested with SstII and PmeI, which are adjacent sites and open the vector between
the cbh1 promoter and terminator sequences for the subsequent insertion of desired peptide
coding sequences. The coding sequences for CBHI+5 R3-hirulog units and CBHI+single
R3-hirulog unit were cut from pEN603 and pEN604, respectively, by SstII-SmaI double
digestion. The Trichoderma expression vector for the longer fusion protein (i.e. with 5 
R3-hirulog units) was designated as pEN605 and the expression vector for the CBHI-single 
R3-hirulog unit as pEN606 (Fig. 15). 



Example 14

Construction of a vector for expression of CBHI-hemagglutinin epitope fusion protein in
Trichoderma longibrachiatum

A Trichoderma expression vector for production of CBHI (catalytic core and linker
region)-hemagglutinin (HA) epitope was constructed in two steps. In the first step the epitope coding
sequence, isolated from pEpiNR digested with NheI and BstEII, was ligated into pEN602 which
was opened by NheI and BstEII. This first vector carrying the coding sequence for CBHI-HA
epitope was designated as pEN701 (Fig. 13). In the second and final step the CBHI-HA epitope
insert was obtained from pEN701 by SstII and SmaI digestion and ligated into SstII-PmeI opened
pTIC (Fig. 14). This Trichoderma expression vector for production of CBHI-HA epitope under
control of the cbh1 promoter was designated as pEN702 (Fig. 15).

Example 15 

Transformation of Trichoderma longibrachiatum

The T. longibrachiatum host strain used for polypeptide expression was T. longibrachiatum
strain 1A52P13, which was derived from strain RL-P37 by inactivating the genes for all major
cellulases: CBHI, CBHII, EGI and EGII (described in US Patent 5,472,864). This strain requires
exogenous uridine for growth due to inactivation of the pyr4 gene.

The host strain, 1A52P13, was cultivated overnight in Vogel's medium supplemented with 2
mg/ml uridine and 1% glucose. Mycelium was centrifuged in 50 ml conical tubes in a clinical
centrifuge at full speed for 10 minutes and washed twice with 1.2 M MgSO4, 10 mM
Na-phosphate, pH 5.8 and the supernatant was discarded. The washed mycelium was resuspended in
Novozyme solution [200 mg of Novozym 234 (Interspex Products, Foster City, CA) in 40 ml of
1.2 M MgSO4, 10 mM Na-phosphate, pH 5.8; filter-sterilized]. The mycelial suspension was
shaken in Novozyme solution at 150 rpm on an orbital shaker at 28 °C for 1 to 1.5 h, and
formation of protoplasts was monitored using a microscope. After the incubation, the protoplast
solution was poured into sterile, clear Oakridge tubes, filling the tubes half full. One volume of
0.6 M sorbitol, 0.1 M Tris-HCl, pH 7.0 was pipetted slowly on top of the protoplast suspension.
The tubes were centrifuged, without brakes, in an HB-4 rotor (Sorvall) at 4000 rpm, for 15
minutes, at 4 °C. Protoplasts, which formed a cloudy layer between the two phases, were
collected into clear Oakridge tubes using plastic, disposable Pasteur pipettes. The protoplast
suspension was mixed with one volume of 1.2 M sorbitol buffer (1.2 M sorbitol, 10 mM CaCl2,
10 mM Tris-HCl, pH 7.5) and centrifuged in an HB-4 rotor at 4500 rpm, for 5 minutes.
Protoplasts were washed once more with 4-6 ml of 1.2 M sorbitol buffer. The protoplast pellet
was resuspended in 300 - 1000 µl of 1.2 M sorbitol buffer and protoplasts were counted using a
hemocytometer. Protoplasts were diluted to yield 10^7 - 10^8 protoplasts per ml of buffer. 20 µl of
DNA in TE buffer (TE was used as a control) was mixed with 200 µl of protoplast suspension,
followed by addition of 50 µl of PEG solution (25% PEG 6000, 50 mM CaCl2, 10 mM Tris-HCl,
pH 7.5). This transformation mixture was placed on ice for 20 minutes. After incubation on ice, 2
ml of PEG solution was added, followed by 4 ml of 1.2 M sorbitol buffer. This transformed
protoplast solution (in 100, 500 or 1000 µl aliquots) was added to melted soft agar medium
(Vogel's) and poured on Vogel's plates. Both the soft agar and plates were osmotically stabilized
by 1.2 M sorbitol. Plates were incubated at 30 °C.
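
The volumes given in the transformation procedure determine how many protoplasts end up in each plated aliquot. The Python sketch below works through that arithmetic as an illustration. The constant and function names are hypothetical, and the calculation assumes even mixing and no loss of protoplasts during handling, which the text does not state.

```python
# Volumes in microlitres, taken from the transformation procedure above.
DNA_UL = 20
PROTOPLASTS_UL = 200
PEG_FIRST_UL = 50
PEG_SECOND_UL = 2000
SORBITOL_BUFFER_UL = 4000


def final_volume_ul():
    """Total volume of the transformed protoplast solution."""
    return (DNA_UL + PROTOPLASTS_UL + PEG_FIRST_UL
            + PEG_SECOND_UL + SORBITOL_BUFFER_UL)


def protoplasts_per_aliquot(density_per_ml, aliquot_ul):
    """Protoplasts in one plated aliquot (assumes even mixing, no losses)."""
    total_protoplasts = density_per_ml * PROTOPLASTS_UL / 1000.0
    return total_protoplasts * aliquot_ul / final_volume_ul()


print("final volume:", final_volume_ul(), "ul")
for density in (1e7, 1e8):           # range stated in the text
    for aliquot in (100, 500, 1000):  # aliquot sizes stated in the text
        n = protoplasts_per_aliquot(density, aliquot)
        print(f"{density:.0e} per ml, {aliquot} ul aliquot: {n:.2e} protoplasts")
```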

Transformation of T. reesei 1A52P13 with pEN605 resulted in two transformants expressing
CBHI-hirulog (5 units) fusion protein. These transformants were designated as 605.5 and 605.23.
Transformation with pEN606 resulted in five strains (606.9, 606.15, 606.17, 606.24 and 606.25)
which produced CBHI-hirulog (1 unit) fusion protein. Transformants shown to produce
CBHI-hemagglutinin epitope were designated as 702.1, 702.6 and 702.10. The best transformants, as
determined by the intensities of the product in Coomassie stained gel, were analysed further.

Example 16 

Cultivation of peptide-producing transformants of Trichoderma longibrachiatum

Transformants were maintained on Vogel's medium. Baffled shake flasks with 50 ml of either
lactose or Proflo medium were inoculated with agar blocks of fungal culture. The lactose
medium consisted of (g/l): lactose 10.0; peptone 2.0; yeast extract 1.0; KH2PO4 15.0;
(NH4)2SO4 2.0; MgSO4·7H2O 0.3; CaCl2·2H2O 0.3; trace metal stock solution 1.0 ml/l. Proflo
medium is composed of (g/l): Proflo 22.5; lactose 30.0; (NH4)2SO4 6.5; KH2PO4 2.0;
MgSO4·7H2O 0.3; CaCl2 0.2; CaCO3 0.72; trace metal stock solution 1.0 ml/l and 10% Tween
80 2.0 ml/l. The trace metal stock solution used in both media had (g/l): FeSO4·7H2O 5.0;
MnSO4·H2O 1.6; ZnSO4·7H2O 1.4; CoCl2·6H2O 2.8.
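
Since the media are specified per litre but the cultures were grown in 50 ml volumes, the amounts per flask follow by simple scaling. The Python sketch below shows this for the lactose medium. It is a minimal illustration, the names are hypothetical, and only the solid components are included (the trace metal stock solution, given in ml/l, is left out).

```python
# Lactose medium, solid components in g per litre (values from the text).
LACTOSE_MEDIUM_G_PER_L = {
    "lactose": 10.0,
    "peptone": 2.0,
    "yeast extract": 1.0,
    "KH2PO4": 15.0,
    "(NH4)2SO4": 2.0,
    "MgSO4·7H2O": 0.3,
    "CaCl2·2H2O": 0.3,
}


def scale_recipe(recipe_per_litre, volume_ml):
    """Scale a per-litre recipe to the given culture volume in millilitres."""
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in recipe_per_litre.items()}


per_flask = scale_recipe(LACTOSE_MEDIUM_G_PER_L, 50)  # 50 ml shake flask culture
for name, grams in per_flask.items():
    print(f"{name}: {grams:.3f} g")
```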



Transformants were cultivated for six days at 30°C and samples were drawn daily for analysis. 
Culture supernatants were analyzed using Coomassie stained SDS-PAGE and Western analysis 
as described for analysis of Aspergillus supernatants. The supernatant samples were concentrated
approximately four fold using Centricon-10 (Amicon) concentrators. 

It was possible to identify the CBHI-polypeptide fusion proteins produced by transformants by 
comparison with culture supernatant from strain 1A52P13, which produced no CBHI protein, or
from a strain of T. reesei deleted for the genes encoding EGI and EGII, which produced native
CBHI protein. Samples of supernatant from the R3-hirulog-producing strains 605.5 and 606.17
were analyzed by SDS-PAGE. The CBHI protein product from 16 hour-old cultures of both
605.5 and 606.17 had a higher molecular weight as determined by SDS-PAGE than the CBHI
protein from later timepoints, which were of the same size as native CBHI from the control
strain. This reduction in size after the first day of cultivation was particularly clear for the fusion
protein carrying five R3-hirulog units, because this fusion protein was 6.5 kD larger than CBHI.
This visible reduction in size, observed by the second day of culture, suggested degradation of
the fusion protein. 

Supernatant samples from the CBHI-hemagglutinin epitope producing transformant 702.1 from 
different timepoints during culture were run on SDS-PAGE, blotted on nitrocellulose and
analyzed with the monoclonal antibody specific to hemagglutinin epitope. A signal was observed 
in samples taken after 16 hours of culture. However, a weaker signal was observed with samples 
from 2 day old cultures and no signal was observed from the third day of culture and beyond, 
even though a Coomassie stained gel showed that the later samples contained a protein product 
the same size as CBHI. 

WHAT IS CLAIMED IS:

1. A fusion nucleic acid encoding a fusion polypeptide comprising, from a 5' end of said 
fusion nucleic acid, first, second, third and fourth nucleic acids, wherein said first nucleic acid 
encodes a signal polypeptide functional as a secretory sequence in a first filamentous fungus, 
said second nucleic acid encodes a secreted polypeptide or functional portion thereof normally 
secreted from said first or a second filamentous fungus, said third nucleic acid encodes a
cleavable linker and said fourth nucleic acid comprises two or more nucleic acids each
encoding desired polypeptides. 

2. A fusion nucleic acid according to claim 1 wherein said two or more nucleic acids encode 
the same polypeptide. 

3. A fusion nucleic acid according to claim 1 wherein at least two of said nucleic acids encode 
different desired polypeptides. 

4. A fusion nucleic acid according to claim 1 wherein said fourth nucleic acid further 
comprises at least one nucleic acid encoding a cleavable linker, wherein said nucleic acids 
encoding said desired polypeptides are separated by said nucleic acid encoding said cleavable 
linker. 

5. A fusion nucleic acid according to claim 4 wherein said fourth nucleic acid comprises at 
least three nucleic acids encoding desired polypeptides and at least two nucleic acids encoding 
cleavable linkers, wherein each of said nucleic acids encoding desired polypeptides is separated 
by one of said nucleic acids encoding a cleavable linker. 

6. A fusion nucleic acid according to claim 4 wherein said fourth nucleic acid comprises at 
least four nucleic acids encoding desired polypeptides and at least three nucleic acids encoding 
cleavable linkers, wherein each of said nucleic acids encoding desired polypeptides is separated 
by one of said nucleic acids encoding a cleavable linker. 

7. A fusion nucleic acid according to claim 5 wherein said fourth nucleic acid comprises at 
least four nucleic acids encoding desired polypeptides and at least three nucleic acids encoding 
cleavable linkers, wherein each of said nucleic acids encoding a cleavable linker is positioned
between nucleic acids encoding desired polypeptides. 

8. A fusion nucleic acid encoding a fusion polypeptide comprising, from a 5' end of said 
fusion nucleic acid, first, fifth, third and second nucleic acids, wherein said first nucleic acid 
encodes a signal polypeptide functional as a secretory sequence in a first filamentous fungus, 
said second nucleic acid encodes a secreted polypeptide or functional portion thereof normally 
secreted from said first or a second filamentous fungus, said third nucleic acid encodes a 
cleavable linker and said fifth nucleic acid comprises at least one nucleic acid encoding a 
desired polypeptide. 

9. A fusion nucleic acid encoding a fusion polypeptide comprising: 

a) a first nucleic acid encoding a signal polypeptide functional as a secretory sequence 
in a first filamentous fungus; operably linked to 

b) a second nucleic acid comprising first and second parts together encoding a 
secreted polypeptide or functional portion thereof normally secreted from said first or a 
second filamentous fungus; and 

c) an insertion nucleic acid between said first and second parts comprising a fifth 
nucleic acid flanked by third nucleic acids; 

wherein said third nucleic acids encode cleavable linkers and said fifth nucleic acid comprises 
at least one nucleic acid encoding a desired polypeptide. 

10. A fusion nucleic acid encoding a fusion polypeptide comprising, from a 5' end of said 
fusion nucleic acid: 

a) a first nucleic acid encoding a signal polypeptide functional as a secretory sequence 
in a first filamentous fungus; operably linked to 

b) a second nucleic acid comprising first and second parts and encoding a secreted 
polypeptide or functional portion thereof normally secreted from said first or a second 
filamentous fungus, wherein said secreted polypeptide contains a linker region; and 

c) an insertion nucleic acid between said first and second parts comprising a fifth 
nucleic acid flanked by third nucleic acids, wherein the insertion point is within said 
linker region; 

wherein said third nucleic acids encode cleavable linkers and said fifth nucleic acid comprises
at least one nucleic acid encoding a desired polypeptide. 

11. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said first nucleic acid encodes
a signal polypeptide or portion thereof selected from the group consisting of signal
polypeptides from glucoamylase, α-amylase, and aspartyl protease from Aspergillus species,
signal polypeptides from bovine chymosin and human tissue plasminogen activator and signal
polypeptides from Trichoderma cellobiohydrolase I and II.

12. A fusion nucleic acid according to Claim 11 wherein said first nucleic acid encodes the
signal polypeptide from Aspergillus awamori glucoamylase. 

13. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said second nucleic acid 
encodes a secreted polypeptide selected from the group consisting of glucoamylase,
α-amylase, and aspartyl protease from Aspergillus species and Trichoderma cellobiohydrolase I
and II.

14. A fusion nucleic acid according to Claim 13 wherein said second nucleic acid encodes at 
least the catalytic domain of glucoamylase from Aspergillus awamori. 

15. A fusion nucleic acid according to Claim 13 wherein said second nucleic acid encodes at 
least the catalytic domain and part of the linker region of glucoamylase from 
Aspergillus awamori. 

16. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said second nucleic acid 
encodes glucoamylase from Aspergillus awamori. 

17. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said third nucleic acid encodes 
a cleavable linker selected from the group consisting of the prosequence from chymosin, the 
prosequence of subtilisin, methionine, and sequences recognized by trypsin, factor Xa,
collagenase, clostripain, subtilisin and chymosin. 



18. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said cleavable linker is a 
sequence recognized by trypsin. 

19. A fusion nucleic acid according to Claim 18 wherein said sequence is a lysine residue. 

20. A fusion nucleic acid according to Claim 18 wherein said sequence is lysine-arginine.

21. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said third nucleic acid encodes 
the prosequence of chymosin or a portion thereof. 

22. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said fourth nucleic acid 
encodes a desired polypeptide selected from the group consisting of enzymes, hormones, 
growth factors, cytokines, and serum proteins. 

23. A fusion nucleic acid according to Claim 1, 8 or 9 wherein said fourth nucleic acid 
encodes R3-hirulog.

24. An expression vector for transforming a host filamentous fungus comprising nucleic acids 
encoding regulatory sequences functionally recognized by said host filamentous fungus 
including promoter and transcription and translation initiation sequences operably linked to the 
5' end of the fusion nucleic acid of Claim 1, 8 or 9 and transcription stop sequences and 
polyadenylation sequences operably linked to the 3' end of said fusion nucleic acid. 

25. The expression vector of Claim 24 wherein said first and said second nucleic acids 
encoding respectively said signal polypeptide and said secreted polypeptide are selected from 
filamentous fungi of the same genus as said host filamentous fungus. 

26. The expression vector of Claim 24 wherein said genus is selected from the group 
consisting of Aspergillus, Trichoderma, Neurospora, Penicillium, Cephalosporium,
Podospora, Endothia, Mucor, Cochliobolus, Pyricularia, Achlya, Fusarium and Humicola.

27. The expression vector of Claim 26 wherein said genus is Aspergillus. 

28. The expression vector of Claim 24 wherein said first and said second nucleic acids
encoding respectively said signal polypeptide and said secreted polypeptide are from said host 
filamentous fungus. 

29. A filamentous fungus containing a fusion nucleic acid according to claim 1, 8 or 9.

30. A filamentous fungus containing an expression vector according to claim 24. 

31. A fusion polypeptide comprising, from a 5' end of said fusion polypeptide, first, second, 
third and fourth amino acid sequences, said first amino acid sequence comprising a signal 
polypeptide functional as a secretory sequence in a first filamentous fungus, said second amino 
acid sequence comprises a secreted polypeptide or functional portion thereof normally 
secreted from said first or a second filamentous fungus, said third amino acid sequence 
comprises a cleavable linker and said fourth amino acid sequence comprises two or more 
desired polypeptides. 

32. A fusion polypeptide comprising, from a 5' end of said fusion polypeptide, first, fifth, third 
and second amino acid sequences, wherein said first amino acid sequence comprises a signal 
polypeptide functional as a secretory sequence in a first filamentous fungus, said second amino 
acid sequence comprises a secreted polypeptide or functional portion thereof normally 
secreted from said first or a second filamentous fungus, said third amino acid sequence 
comprises a cleavable linker and said fifth amino acid sequence comprises at least one desired 
polypeptide. 

33. A fusion polypeptide comprising, from a 5' end of said fusion polypeptide: 

a) a first amino acid sequence comprising a signal polypeptide functional as a 
secretory sequence in a first filamentous fungus; 

b) a second amino acid sequence comprising first and second parts and comprising a 
secreted polypeptide or functional portion thereof normally secreted from said first or a 
second filamentous fungus; and 

c) an insertion amino acid sequence between said first and second parts comprising a 
fifth amino acid sequence flanked by third amino acid sequences;

wherein said third amino acid sequences comprise cleavable linkers and said fifth amino acid
sequence comprises at least one nucleic acid encoding a desired polypeptide. 

34. A process for producing a desired polypeptide comprising transforming a host filamentous 
fungus with an expression vector containing the fusion nucleic acid of Claim 1, 8 or 9 under
conditions which permit expression of said fusion nucleic acid to cause the secretion of the 
desired polypeptide encoded by said fusion nucleic acid. 